Abstract

The problem addressed is to design a detector which is maximally sensitive to specific quantum 
states. Here we concentrate on quantum state detection using the worst-case a posteriori probability
of detection as the design criterion. This objective is equivalent to asking the question: if the detector 
declares that a specific state is present, what is the probability of that state actually being present? 
We show that maximizing this worst-case probability (maximizing the smallest possible value of this 
probability) is a quasiconvex optimization over the matrices of the POVM (positive operator valued 
measure) which characterize the measurement apparatus. We also show that with a given POVM, the 
optimization is quasiconvex in the matrix which characterizes the Kraus operator sum representation 
(OSR) in a fixed basis. We use Lagrange Duality Theory to establish the optimality conditions for 
both deterministic and randomized detection. We also examine the special case of detecting a single 
pure state. Numerical aspects of using convex optimization for quantum state detection are also 
discussed. 
1 Introduction



why Information is extracted from a quantum system by measurement. The most information that it is 
possible to extract from the quantum system is given by its state, specified by a density operator, and it 
is impossible to determine this from a single measurement. The problem of detecting information stored 
in the state of a quantum system is therefore a fundamental problem in quantum information theory. 
The nature of this problem is essentially the design of measurements such that they yield the optimum 
information for the specified purpose. That is, the construction of matrices representing positive operator 
valued measures (POVMs) which give the best performance against a given set of criteria, subject to 
constraints reflecting the underlying properties of the quantum mechanics, or costs associated with the 
implementation of certain operations. 

The emergence of quantum information processing has raised important new issues, and made more 
urgent the development of tools for the design of quantum measurements. In this paper we present a gen- 
eral formalism that enables this design across a wide range of applications. In particular we show that the 
problem may be cast in the form of a convex optimization over the possible POVMs, and that this allows 
powerful numerical tools to identify the globally optimal measurements to achieve the desired objective. 
Such optimizations are useful even if they turn out to be difficult to implement in the laboratory, since 
they provide a benchmark for the performance of experimentally feasible measurements. 

The objective of a measurement in quantum information theory depends on the way in which information
is encoded into the quantum system to begin with, and this, in turn, depends on the application.
In quantum cryptography, for example, the information is encoded by the sender choosing randomly 
between two non-orthogonal bases, both of which can encode a single bit. The ability of an eavesdrop- 
per, who is in principle unable to influence the choice of preparation and measurement bases chosen by 
the sender and receiver of this information, to extract information from the transmitted quantum bits, 
depends on her ability to determine which of four non-orthogonal states were sent. In a quantum infor- 
mation processor, the information in the register at the end of a computation often resides in orthogonal 
states, and the goal of the measurement is simply to read out the register by distinguishing among the 
sets of such states. However, the operation of such a processor may itself depend on measurements.
For example, quantum error correction protocols require the measurement of an ancilla to preserve the 
quantum state of the register itself. In another example, the cluster computing model [40] and the linear 
optical quantum computer [29], both rely on measurements of ancillary qubits for the operation of the 
logic gates themselves. 

In these examples of conditional state preparation, it is vital that there is a high degree of correlation 
between the outcome of the measurement and the quantum state prepared in the register by the measure- 
ment. The structure of the measurement should therefore be such that this correlation is maximized. Thus 
it is vital to consider the case when the detectors have noise, and to develop strategies for optimizing the 
measurement in the presence of this and inherent inefficiencies. Despite the fundamental inability to de- 
termine the quantum state of a system from a single measurement, it is sometimes useful to make such a 
determination from a set of independent measurements on identically prepared systems. This procedure 
is called quantum state tomography. Similarly, one may characterize the action of a quantum operation 
by determining its effect on a known input quantum state from a determination of the output state. In 
this application, the central questions are: how many measurements are needed to determine the state to 
within a given precision, and how should these measurements be constructed? That is, it is essentially a 
problem of experiment design. Optimal experiment design for quantum state tomography and quantum process
tomography is considered in [30]. 

In this work, we concentrate on the problem of quantum state detection. That is, the design of 
POVMs that can determine whether or not a particular component was present in the input state to the 
detector. The problem is thus equivalent to the design of a quantum channel that optimally transforms 
the input state distribution (assumed to be given, and including non-orthogonal states) to the output
measurement outcomes. The channel may be lossy, and may introduce noise, and thus there may be 
latency in the measurement, in which certain outcomes are ambiguous. 

previous work Several approaches have emerged for distinguishing between a collection of non- 
orthogonal quantum states. An accessible review can be found in the article by Chefles [8]. In one 

approach, called quantum hypothesis testing, a measurement is designed to minimize the probability of 
a detection error [25, 23, 44, 17, 6, 35, 2, 15, 16, 14]. Necessary and sufficient conditions for an opti- 
mum measurement maximizing the probability of correct detection have been developed in [17] using a 
semidefinite programming approach, and earlier in [25] (the drawback of this approach is that it does not 
readily lend itself to efficient computational algorithms). Closed-form analytical expressions for the op- 
timal measurement have been derived for several special cases [23, 6, 35, 2, 15, 16]. Iterative procedures 
maximizing the probability of correct detection have also been developed for cases in which the optimal
measurement cannot be found explicitly [24, 17]. A specific design for achieving the optimal discrimination
between non-orthogonal coherent states has been given in [3], and for non-orthogonal polarization
states of a single photon by [4]. Optimal discrimination amongst more than two non-orthogonal states 
has also been analyzed [39] and demonstrated experimentally [10]. 

A more recent approach, referred to as unambiguous detection [27, 11, 36, 28, 38, 7, 9, 13, 12, 18], is
to design a measurement that with a certain probability returns an inconclusive result, but such that if the 
measurement returns an answer, then the answer is correct with probability 1. Chefles [7] showed that 
a necessary and sufficient condition for the existence of unambiguous measurements for distinguishing 
between a collection of pure quantum states is that the states are linearly independent. Necessary and 
sufficient conditions on the optimal measurement minimizing the probability of an inconclusive result for 
pure states were derived in [13]. The optimal measurement when distinguishing between a broad class of 
symmetric pure-state sets was also considered in [13]. The problem of unambiguous detection between 
mixed state ensembles was first considered in [41]. Necessary and sufficient optimality conditions for
unambiguous mixed state detection were developed in [18]. 

Experimental configurations may not allow the ideal measurements to be made, and thus the performance
of feasible apparatuses has been analyzed. For example, an apparatus for the unambiguous
discrimination between two orthogonal states of a single photon using homodyne detection, rather than 
photon counting, which has higher losses and more noise, has been examined in [22] and [33]. An exper- 
imental implementation of the process for discriminating unambiguously between two non-orthogonal 
states of polarization of a single photon has been demonstrated in [26]. 

An interesting alternative approach for distinguishing between a collection of quantum states, which 
is a combination of the previous two approaches, is to allow for a certain probability of an inconclusive 
result, and then maximize the probability of correct detection [12, 45, 20]. 

what's new here Prior work has considered optimal detector design only for an average measure of 
the probability of detection, such as the average joint probability of detection. For example, in [17, 13] 
it is shown that using this criterion, detector design can be formulated as a convex optimization over the 
matrices in the POVM, specifically a semidefinite program (SDP). Here we concentrate on quantum state 
detection using the worst-case a posteriori probability of detection as the design criterion. This objective 
is equivalent to asking the question: if the detector declares that a specific state is present, what is the 
probability of that state actually being present? We show that maximizing the smallest possible value of 
this probability is a quasiconvex optimization over the POVM matrices, or over the Kraus
operator-sum-representation (OSR) in a fixed basis. Issues relating to conditions of optimality and numerical aspects
of convex optimization of state detection are also discussed. 
We will show that many of the standard measures of detector performance (including those previously
considered) are also convex functions of the detector design parameters. In addition, we will see that the 
design parameters, either POVM or OSR, are in a convex set. As a result we can cast a number of detector 
design problems as a convex optimization. Details and underlying theory about convex optimization are 
in the text by Boyd and Vandenberghe [5]. As stated there, the great advantage of convex optimization 
is that a globally optimal solution can be found efficiently and reliably, and perhaps most importantly, can be
computed to within any desired accuracy using an interior-point method. 

Another advantage to being able to obtain a globally optimal solution is that the resulting perfor- 
mance can be used as a benchmark against which the initial detector design can be compared. If the 
optimal performance is significantly better, then there is compelling reason to try and implement the 
optimal solution or to try and modify the initial design in the "direction" of the optimal solution, if that 
is clear from the physical implementation. 

In a few instances we use Lagrange Duality Theory to derive formulas for direct calculation of the
optimal objective value and the associated POVM matrices. These calculations only involve singular 
value decomposition of the problem data. 



2 Problem formulation 
2.1 Detector 



A quantum state detector is considered here as an input/output device mapping a state (density matrix)
$\rho \in \mathbf{C}^{n \times n}$ at the input into one of a number of discrete outcomes at the output as illustrated in Figure 1.

[Figure 1: Quantum state detector (block diagram: Detector with output $d \in D_{\rm out}$)]

Specifically, the input state is drawn randomly from

$$D_{\rm in} = \{ \rho_i \in \mathbf{C}^{n \times n},\ 0 < p_i < 1 \mid i = 1, \ldots, m \} \qquad (1)$$

where $p_i$ is the occurrence probability of $\rho_i$, that is,

$$p_i = \mathrm{Prob}\{ \rho = \rho_i \}, \quad i = 1, \ldots, m \qquad (2)$$

The set of detector outcomes is,

$$D_{\rm out} = \{ i \mid i = 1, \ldots, m \} \qquad (3)$$

The problem addressed is to design the detector to be able to determine the presence of some or all 
of the specified set of input states given knowledge of the input set Din and the associated occurrence 
probabilities. Although the principal focus is on an equal number of state inputs and detector outcomes, 
this is not always the case, e.g., noisy measurements can result in unequal inputs and outcomes as briefly
discussed in Section 5.1. 
2.2 Performance probabilities

Detector performance is usually assessed by examination of one or more of the following probability
matrices:

$$\begin{array}{ll}
\text{joint probability matrix} & p_{\rm joint}(i,j) = \mathrm{Prob}\{ \text{detect } i \text{ AND input } j \} \\
\text{conditional probability matrix} & p_{\rm out|in}(i|j) = \mathrm{Prob}\{ \text{detect } i \text{ GIVEN input } j \} \\
\text{a posteriori probability matrix} & p_{\rm in|out}(j|i) = \mathrm{Prob}\{ \text{input } j \text{ GIVEN detect } i \}
\end{array} \qquad (4)$$

As shown in any standard text [21] these probabilities are related as follows:

$$\begin{array}{rcl}
p_{\rm joint}(i,j) &=& p_{\rm out|in}(i|j)\, p_j = p_{\rm in|out}(j|i)\, p_{\rm out}(i) \\
p_{\rm out}(i) &=& \sum_{j=1}^{m} p_{\rm out|in}(i|j)\, p_j
\end{array} \qquad (5)$$



Without loss of generality we can order the input and output events so that detector event 1 corresponds 
to input event 1, detector event 2 to input event 2, and so on. With this ordering, detector performance 

can be assessed by the error probabilities: 

$$\begin{array}{rcl}
e_{\rm joint}(i) &=& \mathrm{Prob}\{ \text{detect } i \text{ AND input } j \neq i \} = p_{\rm out}(i) - p_{\rm joint}(i,i) \\
e_{\rm cond}(i) &=& \mathrm{Prob}\{ \text{detect } j \neq i \text{ GIVEN input } i \} = 1 - p_{\rm out|in}(i|i) \\
e_{\rm post}(i) &=& \mathrm{Prob}\{ \text{input } j \neq i \text{ GIVEN detect } i \} = 1 - p_{\rm in|out}(i|i)
\end{array} \qquad (6)$$

Observe that each of these is the sum of the off-diagonal elements of the corresponding probability
matrices (4). Being error probabilities, they all range from zero to one:

$$e_{\rm joint}(i),\ e_{\rm cond}(i),\ e_{\rm post}(i) \in [0,1], \quad i = 1, \ldots, m \qquad (7)$$

As we will see shortly, it is convenient to express each of the error probabilities in terms of $p_{\rm out|in}(i|j)$
and $p_j$. Using (5) gives,

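$$\begin{array}{rcl}
e_{\rm joint}(i) &=& \sum_{j=1}^{m} p_{\rm out|in}(i|j)\, p_j - p_{\rm out|in}(i|i)\, p_i \\
e_{\rm cond}(i) &=& 1 - p_{\rm out|in}(i|i) \\
e_{\rm post}(i) &=& 1 - \dfrac{p_{\rm out|in}(i|i)\, p_i}{p_{\rm out}(i)} = 1 - \dfrac{p_{\rm out|in}(i|i)\, p_i}{\sum_{j=1}^{m} p_{\rm out|in}(i|j)\, p_j}
\end{array} \qquad (8)$$

The expression for $e_{\rm post}(i)$ is valid only if $p_{\rm out}(i) \neq 0$, which is assumed.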
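
The relations (5)-(8) can be sketched numerically. The following is a minimal illustration, not part of the original analysis: the function name `error_probabilities` and the row/column layout (`p_cond[i][j]` = Prob{detect i GIVEN input j}, square matrix) are assumptions made for this example.

```python
def error_probabilities(p_cond, priors):
    """Error probabilities (8) from a conditional probability matrix.

    p_cond[i][j] = Prob{detect i GIVEN input j}  (assumed layout)
    priors[j]    = Prob{input j}
    Returns three lists: e_joint, e_cond, e_post.
    """
    m = len(priors)
    e_joint, e_cond, e_post = [], [], []
    for i in range(m):
        # Total probability of outcome i, second line of (5).
        p_out_i = sum(p_cond[i][j] * priors[j] for j in range(m))
        if p_out_i <= 0.0:
            raise ValueError("e_post(i) is undefined when p_out(i) = 0")
        hit = p_cond[i][i] * priors[i]  # joint probability p_joint(i, i)
        e_joint.append(p_out_i - hit)
        e_cond.append(1.0 - p_cond[i][i])
        e_post.append(1.0 - hit / p_out_i)
    return e_joint, e_cond, e_post


if __name__ == "__main__":
    p_cond = [[0.9, 0.2],
              [0.1, 0.8]]
    print(error_probabilities(p_cond, [0.5, 0.5]))
```

Note that the three error measures generally rank detectors differently: in this example outcome 1 has the smaller conditional error but the larger a posteriori error.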
2.3 Perfect and unambiguous detection 

Perfect detection occurs when the detector reads $i$ only if the $i$th input is present. Thus the detector is
correct all the time. In this case the a posteriori probability matrix is identity which can only occur when
the conditional probability matrix is identity, i.e.,

$$p_{\rm out|in}(i|j) = \delta_{ij}, \quad i,j = 1, \ldots, m \qquad (9)$$

Under this condition, all the error probabilities in (8) are simultaneously identically zero. As might also
be expected, perfect performance is independent of the input distribution {pj}. For quantum systems, 
this is possible if and only if the input states are orthogonal [37]. 
A weaker condition, referred to as unambiguous detection, occurs when the detector either provides
the correct answer or one that is inconclusive with some probability (see, e.g., [12, 13]). This detector
requires an additional outcome corresponding to the inconclusive result. There are now $m + 1$ detector
outcomes, $D_{\rm out} = \{ i \mid i = 0, \ldots, m \}$, where outcome 0 means the result is inconclusive. As before,
for $i = 1, \ldots, m$, outcome $i$ means that input $i$ is declared to be present. For the detector to be correct
when $i$, $i = 1, \ldots, m$, is declared, the a posteriori probability of input $i$ given outcome $i$ must be
1. Equivalently, the submatrix of the conditional probability matrix corresponding to the $m$ states is
diagonal but not necessarily identity, as in perfect detection. Thus (9) now becomes,

$$p_{\rm out|in}(i|j) = p(i)\, \delta_{ij}, \quad i,j = 1, \ldots, m \qquad (10)$$

Under this condition, the a posteriori error probability is,

$$e_{\rm post}(i) = 1 - \frac{p_{\rm out|in}(i|i)\, p_i}{\sum_{j=1}^{m} p_{\rm out|in}(i|j)\, p_j} = 1 - \frac{p(i)\, p_i}{p(i)\, p_i} = 0 \qquad (11)$$

for all $i = 1, \ldots, m$, and the probability of an inconclusive result is,

$$p_{\rm out}(0) = 1 - \sum_{i=1}^{m} p(i)\, p_i \qquad (12)$$

If the probability of an inconclusive result is non-zero, then this detector is a type of randomized detector.
A detector designed without this feature will be referred to as a deterministic detector. We will return to
the problem of designing an unambiguous and/or randomized detector in Section 5.2.
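
As a small numerical sketch of unambiguous detection, the probability of an inconclusive result can be computed from the diagonal entries $p(i)$ of the conditional probability matrix and the priors. This assumes, as for any conditional probability matrix, that for each input the probabilities of the $m + 1$ outcomes sum to one; the function name is hypothetical.

```python
def inconclusive_probability(success, priors):
    """Probability of outcome 0 for an unambiguous detector.

    success[i] = p(i) = Prob{detect i GIVEN input i}  (diagonal entries)
    priors[i]  = Prob{input i}
    Given input i, the detector is inconclusive with probability 1 - p(i),
    so the total is sum_i (1 - p(i)) p_i = 1 - sum_i p(i) p_i.
    """
    return 1.0 - sum(s * p for s, p in zip(success, priors))


if __name__ == "__main__":
    # Two inputs, detected conclusively 50% and 25% of the time.
    print(inconclusive_probability([0.5, 0.25], [0.6, 0.4]))
```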



2.4 Partial state detection 

It is often the case that not all the input states are to be detected. We will show that it is not necessary to 
have a detector outcome for all the states. Consider the input set, 

$$D_{\rm in} = \{ (\rho_i, p_i) \mid i = 1, \ldots, \ell \} \qquad (13)$$

Suppose only the states $\rho_1, \ldots, \rho_k$, $k < \ell$, are to be detected. The $\ell - k$ states that are not being detected
can be lumped into one state, the statistical mixture,

$$r = \sum_{i=k+1}^{\ell} p_i\, \rho_i \qquad (14)$$

with occurrence probability $\sum_{i=k+1}^{\ell} p_i$. Thus the set (13) of $\ell$ states can be replaced with the statistically
equivalent set of $k + 1 \le \ell$ states

$$D_{\rm in} = \Big\{ (\rho_1, p_1), \ldots, (\rho_k, p_k), \Big( r, \sum_{i=k+1}^{\ell} p_i \Big) \Big\} \qquad (15)$$

The detector then only requires $k + 1$ outcomes, not $\ell$ outcomes. To adhere to the previous notation, e.g.,
(1), define $m = k + 1$.

An important application of the above procedure is detection of a single pure state. In this case the 
input state set, in the form of (15), becomes, 

$$D_{\rm in} = \{ (\psi\psi^*,\ 1 - \beta),\ (r,\ \beta) \} \qquad (16)$$

with the pure state $\psi \in \mathbf{C}^{n}$, $\psi^*\psi = 1$ occurring with probability $1 - \beta$ and the remaining states
represented by the mixed state $r \in \mathbf{C}^{n \times n}$, $r \ge 0$, $\mathrm{Tr}\, r = 1$ occurring with probability $\beta$. We use this
example to illustrate the structure of the optimal detector in some cases.
2.5 Measures of performance

The goal is to design the detector to minimize the size of an error probability. The size of the error is
set by selecting a norm. Here we will consider two common norms referred to as average and worst-case.
Since (7) holds - the errors are always non-negative - we can define the average error norm by
$\|e\|_{\rm avg} = \sum_{i=1}^{m} w_i\, e(i)$ and the worst-case norm by $\|e\|_{\rm wc} = \max_{i=1,\ldots,m} w_i\, e(i)$. These norms are
weighted error probabilities: the weights, $w_i \ge 0$, are selected to emphasize specific outcomes - a larger
weight emphasizes the desire to detect a particular state. Table 1 shows these norms for the specific error
probabilities (8).



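$$\begin{array}{l|l|l}
 & \|e\|_{\rm avg} & \|e\|_{\rm wc} \\ \hline
e_{\rm joint} & \sum_i w_i \left( p_{\rm out}(i) - p_{\rm out|in}(i|i)\, p_i \right) & \max_i\, w_i \left( p_{\rm out}(i) - p_{\rm out|in}(i|i)\, p_i \right) \\
e_{\rm cond} & \sum_i w_i \left( 1 - p_{\rm out|in}(i|i) \right) & \max_i\, w_i \left( 1 - p_{\rm out|in}(i|i) \right) \\
e_{\rm post} & \sum_i w_i \left( 1 - \dfrac{p_{\rm out|in}(i|i)\, p_i}{\sum_j p_{\rm out|in}(i|j)\, p_j} \right) & \max_i\, w_i \left( 1 - \dfrac{p_{\rm out|in}(i|i)\, p_i}{\sum_j p_{\rm out|in}(i|j)\, p_j} \right)
\end{array}$$

Table 1: Norms of error probabilities.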
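
The two norms used in Table 1 are simple to state in code. The sketch below is illustrative only; the function names `avg_norm` and `wc_norm` are introduced here and the inputs are assumed to be lists of error probabilities $e(i)$ and non-negative weights $w_i$.

```python
def avg_norm(errors, weights):
    """Weighted average norm: sum_i w_i e(i)."""
    return sum(w * e for w, e in zip(weights, errors))


def wc_norm(errors, weights):
    """Weighted worst-case norm: max_i w_i e(i)."""
    return max(w * e for w, e in zip(weights, errors))


if __name__ == "__main__":
    errors = [0.1, 0.3]
    weights = [1.0, 2.0]
    print(avg_norm(errors, weights), wc_norm(errors, weights))
```

Setting a weight to zero removes the corresponding outcome from either norm, which is how a single state of interest can be singled out.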

In Table 1, the performance measures $\|e_{\rm joint}\|_{\rm avg}$, $\|e_{\rm cond}\|_{\rm avg}$, $\|e_{\rm joint}\|_{\rm wc}$, $\|e_{\rm cond}\|_{\rm wc}$ are convex
functions of the elements of the conditional probability matrix. In the next section we will show that
the conditional probabilities are affine functions of the design parameters, specifically, the elements of
the POVM characterizing the detector. Hence, these are convex functions of the design parameters. Of
these measures of performance, only $\|e_{\rm joint}\|_{\rm avg}$, $\|e_{\rm cond}\|_{\rm avg}$, or slight variations thereof, have been
addressed in the literature. Section 4 briefly describes the convex optimization problem associated with
the performance measure $\|e_{\rm joint}\|_{\rm avg}$ and is to some extent a partial review of known results, e.g., [17].

The performance measure $\|e_{\rm post}\|_{\rm wc}$, which is the focus of this paper, is a quasiconvex function
of the conditional probabilities, and hence, a quasiconvex function of the design parameters, e.g., the 
POVM elements. As we will show in Section 5, the optimal design can be obtained by solving a convex 
optimization problem. 

The performance measure $\|e_{\rm post}\|_{\rm avg}$ is neither a convex nor a quasiconvex function, hence, only local
solutions are guaranteed to be found numerically. 



3 Detector as a POVM 

We start with the assumption that the detector can be completely described by a POVM (positive operator 
valued measure) with matrix elements $\{ O_i \in \mathbf{C}^{n \times n} \mid i = 1, \ldots, m \}$ which, by definition, satisfy,^1

$$\sum_{i=1}^{m} O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m \qquad (17)$$

^1 The notation $X \ge 0$ or $X > 0$ means that $X = X^*$ and all the eigenvalues of $X$ are, respectively, non-negative or strictly
positive.
In consequence, designing an optimal detector means selecting the matrices that form the POVM to
minimize a selected performance measure from Table 1. To this end we express the error probabilities
in terms of the problem data $\{ \rho_i, p_i \}$ and the design variables $\{ O_i \}$. First, the conditional probability of
detecting $i$ given state $j$ is,

$$p_{\rm out|in}(i|j) = \mathrm{Tr}\, O_i \rho_j \qquad (18)$$

From (5), the total probability of detector event $i$ is then,

$$p_{\rm out}(i) = \sum_{j=1}^{m} p_{\rm out|in}(i|j)\, p_j = \sum_{j=1}^{m} (\mathrm{Tr}\, O_i \rho_j)\, p_j = \mathrm{Tr}\, O_i \rho \qquad (19)$$

where $\rho$ is the statistical mixture of all the possible input states,

$$\rho = \sum_{i=1}^{m} p_i\, \rho_i \qquad (20)$$

Throughout we make the assumption that,

$$\rho > 0 \qquad (21)$$

This is not a limiting condition; it is easily satisfied in most practical situations and if necessary can be
overcome by restricting attention to the range space of $\rho$.

The error probabilities in (8) can now be expressed as follows for $i = 1, \ldots, m$:

$$\begin{array}{rcl}
e_{\rm joint}(i) &=& \mathrm{Tr}\, O_i (\rho - p_i \rho_i) \\
e_{\rm cond}(i) &=& 1 - \mathrm{Tr}\, O_i \rho_i \\
e_{\rm post}(i) &=& 1 - \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho}
\end{array} \qquad (22)$$

Observe that $e_{\rm post}(i)$ is meaningful only if $\mathrm{Tr}\, O_i \rho > 0$. Since $O_i \ge 0$ from (17), it follows that if the
mixed state $\rho > 0$, then $\mathrm{Tr}\, O_i \rho = 0$ only when $O_i = 0$, which is a pathological case.
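
To make the trace expressions concrete, the following sketch evaluates the error probabilities for a given POVM using plain nested lists as matrices. It is an illustration under stated assumptions: the helper names (`trace_product`, `mixture`, `povm_errors`) are hypothetical, and no check is made that the supplied matrices form a valid POVM.

```python
def trace_product(a, b):
    """Tr(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def mixture(states, priors):
    """Statistical mixture rho = sum_i p_i rho_i."""
    n = len(states[0])
    return [[sum(p * s[r][c] for s, p in zip(states, priors))
             for c in range(n)] for r in range(n)]


def povm_errors(povm, states, priors):
    """Return [(e_joint(i), e_cond(i), e_post(i))] for each outcome i."""
    rho = mixture(states, priors)
    result = []
    for o_i, rho_i, p_i in zip(povm, states, priors):
        p_out = trace_product(o_i, rho).real      # Tr O_i rho
        cond = trace_product(o_i, rho_i).real     # Tr O_i rho_i
        if p_out <= 0.0:
            raise ValueError("e_post(i) requires Tr O_i rho > 0")
        result.append((p_out - p_i * cond,
                       1.0 - cond,
                       1.0 - p_i * cond / p_out))
    return result


if __name__ == "__main__":
    # Two non-orthogonal qubit states |0><0| and |+><+|, equal priors,
    # measured in the computational basis (an arbitrary, non-optimal POVM).
    states = [[[1, 0], [0, 0]], [[0.5, 0.5], [0.5, 0.5]]]
    povm = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
    print(povm_errors(povm, states, [0.5, 0.5]))
```

In this example outcome 2 has zero a posteriori error, since only the second state can trigger it, while outcome 1 is wrong one third of the time it is declared.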

Using (22), the entries in Table 1 are given explicitly as shown in Table 2. The first observation to
make is that the POVM matrices $\{ O_i \}$ form a convex set (17). As already stated, since $e_{\rm joint}(i)$ and
$e_{\rm cond}(i)$ are affine functions of $O_i$, it follows that these errors are both convex functions of the $O_i$ matrices.
Further, since all norms are convex functions, the performance measures $\|e_{\rm joint}\|_{\rm avg}$, $\|e_{\rm joint}\|_{\rm wc}$ and
$\|e_{\rm cond}\|_{\rm avg}$, $\|e_{\rm cond}\|_{\rm wc}$ are convex functions of the POVM matrices. Again, we note that $\|e_{\rm post}\|_{\rm wc}$
is a quasiconvex function of $\{ O_i \}$ and $\|e_{\rm post}\|_{\rm avg}$ is not convex. Therefore, minimizing any of these
(quasi)convex measures over the POVM matrices can be cast as a convex optimization problem.



Optimality conditions 

Lagrange Duality Theory [5, Ch.5] provides a means for establishing a lower bound on the optimal 
objective value, establishing conditions of optimality, and providing, in some cases, a more efficient 
means to numerically solve the original problem. In the sections to follow we will present the optimality 
conditions in a form which involves only the problem data, $\rho_i, p_i$, $i = 1, \ldots, m$, and the design variables,
the POVM matrices, $O_i$, $i = 1, \ldots, m$. The details are presented in the Appendix. The optimality
conditions can also be used for determining if a known POVM set is optimal, what the authors in [1] 
call: "testing an Ansatz." Such a POVM could be obtained from some analytic means, from data, or 
from imagination. 
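$$\begin{array}{l|l|l}
 & \|e\|_{\rm avg} & \|e\|_{\rm wc} \\ \hline
e_{\rm joint} & \sum_i w_i\, \mathrm{Tr}\, O_i (\rho - p_i \rho_i) & \max_i\, w_i\, \mathrm{Tr}\, O_i (\rho - p_i \rho_i) \\
e_{\rm cond} & \sum_i w_i \left( 1 - \mathrm{Tr}\, O_i \rho_i \right) & \max_i\, w_i \left( 1 - \mathrm{Tr}\, O_i \rho_i \right) \\
e_{\rm post} & \sum_i w_i \left( 1 - \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho} \right) & \max_i\, w_i \left( 1 - \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho} \right)
\end{array}$$

Table 2: Norms of error probabilities as functions of POVM elements.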



Implementation of a POVM 

As shown in [34, §2.2.8], any POVM can be implemented by a unitary matrix in an expanded space 
together with rank-one projective measurements in the natural basis on the ancilla outputs. Some general 
implementations of a POVM are also presented in [32]. Realizing the resulting unitary and rank-one 
projections with specific physical components is, in general, a more difficult problem. 



4 Optimal average joint performance 

In this section we briefly discuss optimal detector design for $\|e_{\rm joint}\|_{\rm avg}$. This problem, with slight variations,
has been essentially completely analyzed in [17]. The presentation here is primarily to illustrate
a few of the ideas which repeatedly occur. Following this, the main focus of this paper, presented in
Section 5, is on detector design for $\|e_{\rm post}\|_{\rm wc}$, optimal worst-case a posteriori design.

A detector which minimizes the objective $\|e_{\rm joint}\|_{\rm avg}$ in Table 2 is obtained by solving the following
optimization problem for the POVM matrices $\{ O_i \}$:

$$\begin{array}{ll}
\text{minimize} & \|e_{\rm joint}\|_{\rm avg} = \sum_{i=1}^{m} w_i\, \mathrm{Tr}\, O_i (\rho - p_i \rho_i) \\
\text{subject to} & \sum_{i=1}^{m} O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array} \qquad (23)$$

This problem was addressed in [17] for equal weights, $w_i = 1$, where the objective becomes
$1 - \sum_i p_i\, \mathrm{Tr}\, O_i \rho_i$. As observed in [17], problem (23), with or without equal weights, is a semidefinite
program (SDP) [5, §4.6.2]. An SDP is a generalization of linear programming where the linear inequalities
are replaced with matrix inequalities. Although it does not make any physical sense, if the $\{ O_i \}$ are
constrained to be diagonal, then the problem reduces to a linear programming problem.
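
The reduction of the objective for equal weights can be written out in one line. The step uses only the POVM normalization (17) and the fact that the mixture $\rho$ has unit trace:

```latex
\begin{align*}
\sum_{i=1}^{m} \mathrm{Tr}\, O_i(\rho - p_i\rho_i)
  &= \mathrm{Tr}\Big(\sum_{i=1}^{m} O_i\Big)\rho \;-\; \sum_{i=1}^{m} p_i\,\mathrm{Tr}\, O_i\rho_i \\
  &= \mathrm{Tr}\,\rho \;-\; \sum_{i=1}^{m} p_i\,\mathrm{Tr}\, O_i\rho_i
   \;=\; 1 - \sum_{i=1}^{m} p_i\,\mathrm{Tr}\, O_i\rho_i .
\end{align*}
```

The subtracted sum is the average probability of correct detection, so minimizing the equally weighted objective is the same as maximizing that probability.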



Optimality conditions 

As derived in Appendix A.1, any feasible POVM, i.e., any set of $O_i \in \mathbf{C}^{n \times n}$, $i = 1, \ldots, m$ which
satisfy (17), is optimal for problem (23) if and only if,

$$\begin{array}{rcl}
A_i - \sum_{j=1}^{m} A_j O_j &\ge& 0, \quad i = 1, \ldots, m \\
\Big( A_i - \sum_{j=1}^{m} A_j O_j \Big) O_i &=& 0, \quad i = 1, \ldots, m
\end{array} \qquad (24)$$

with all the problem data in the matrices,

$$A_i = w_i (\rho - p_i \rho_i), \quad i = 1, \ldots, m \qquad (25)$$

Two state detection 

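As an application consider the two state detection problem using the state set (16). For equal weights
$w_1 = w_2 = 1$, the data matrices are,

$$\begin{array}{rcl}
A_1 &=& \rho - (1 - \beta)\psi\psi^* = \beta r \\
A_2 &=& \rho - \beta r = (1 - \beta)\psi\psi^*
\end{array} \qquad (26)$$

Using $O_1 + O_2 = I$, the optimality conditions become,

$$\begin{array}{ll}
\Delta O_2 \ge 0, & \Delta O_2 O_1 = 0 \\
\Delta O_1 \le 0, & \Delta O_1 O_2 = 0
\end{array} \qquad (27)$$

with

$$\Delta = A_1 - A_2 = \beta r - (1 - \beta)\psi\psi^* \qquad (28)$$

Since $\Delta$ is Hermitian it can be decomposed as,

$$\Delta = [U_+ \ U_-] \begin{bmatrix} \Omega_+ & 0 \\ 0 & \Omega_- \end{bmatrix} [U_+ \ U_-]^* \qquad (29)$$

where $U = [U_+ \ U_-] \in \mathbf{C}^{n \times n}$ is unitary and $(\Omega_+ > 0,\ \Omega_- < 0)$ are diagonal matrices consisting,
respectively, of the positive and negative eigenvalues of $\Delta$. Make the choice,

$$O_1 = U_- U_-^*, \quad O_2 = U_+ U_+^* \qquad (30)$$

This is a feasible POVM set because $U$ is unitary. This choice also satisfies the optimality conditions
(27). Specifically, $\Delta O_1 = U_- \Omega_- U_-^* \le 0$, $\Delta O_2 = U_+ \Omega_+ U_+^* \ge 0$, and because unitary $U$ requires
$U_-^* U_+ = 0$, it follows that $O_1 O_2 = U_- U_-^* U_+ U_+^* = 0$. After some algebra, the optimal objective value
is found to be,

$$\|e_{\rm joint}\|_{\rm avg}^{\rm opt} = \mathrm{Tr}(O_1 A_1 + O_2 A_2) = \beta - \mathrm{Tr}\, \Omega_+ \qquad (31)$$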



As a further illustration, assume that $r$ is completely randomized, that is, $r = I_n/n$. In this case $U$ can be
chosen such that the pure state in (16) has the decomposition $\psi\psi^* = U\, \mathrm{diag}(0, \ldots, 0, 1)\, U^*$.
It then follows that:

$$\begin{array}{rcl}
U^* \Delta U &=& \mathrm{diag}(\beta/n, \ldots, \beta/n,\ -1 + \beta(1 + 1/n)) \\
U^* \rho U &=& \mathrm{diag}(\beta/n, \ldots, \beta/n,\ 1 - \beta(1 - 1/n)) \\
\Omega_+ &=& \mathrm{diag}(\beta/n, \ldots, \beta/n) \\
\Omega_- &=& -1 + \beta(1 + 1/n)
\end{array} \qquad (32)$$

Observe that $\Omega_- < 0$ if and only if $\beta < n/(1 + n)$. In general, as we show below, the assumption that
$\Delta$ has both positive and negative eigenvalues places a limit on the size of $\beta$.

Since $\Omega_+$ is $(n - 1) \times (n - 1)$, we get $\mathrm{Tr}\, \Omega_+ = \beta(n - 1)/n$, and hence the objective value becomes

$$\|e_{\rm joint}\|_{\rm avg}^{\rm opt} = \beta/n \qquad (33)$$
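
The value $\beta/n$ can be checked numerically. The sketch below is illustrative only and relies on the assumption $r = I_n/n$, under which every matrix involved is diagonal in the basis $U$ and can be stored as a list of diagonal entries; the function name is hypothetical.

```python
def uniform_residual_objective(n, beta):
    """Optimal average joint error for r = I_n/n, computed two ways.

    Works with diagonal entries in the basis where psi psi^* is
    diag(0, ..., 0, 1). Returns (beta - Tr Omega_+, Tr(O1 A1 + O2 A2)).
    """
    if not 0.0 < beta < n / (1.0 + n):
        raise ValueError("Delta must have a negative eigenvalue: beta < n/(1+n)")
    # Eigenvalues of Delta = beta*r - (1 - beta) psi psi^*.
    delta = [beta / n] * (n - 1) + [beta / n - (1.0 - beta)]
    omega_plus = [d for d in delta if d > 0.0]
    via_eigenvalues = beta - sum(omega_plus)
    # Direct evaluation with O1 = U_- U_-^* (last axis), O2 = U_+ U_+^*.
    a1 = [beta / n] * n                       # diagonal of A1 = beta r
    a2 = [0.0] * (n - 1) + [1.0 - beta]       # diagonal of A2 = (1-beta) psi psi^*
    o1 = [0.0] * (n - 1) + [1.0]
    o2 = [1.0] * (n - 1) + [0.0]
    direct = (sum(o * a for o, a in zip(o1, a1))
              + sum(o * a for o, a in zip(o2, a2)))
    return via_eigenvalues, direct


if __name__ == "__main__":
    print(uniform_residual_objective(4, 0.5))
```

Both routes agree with $\beta/n$; the only error contribution comes from the residual state $r$ triggering outcome 1.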



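For this detector the corresponding a posteriori probabilities are

$$\begin{array}{rcl}
p_{\rm in|out}(1|1) &=& \dfrac{(1 - \beta)\, \mathrm{Tr}\, O_1 \psi\psi^*}{\mathrm{Tr}\, O_1 \rho} = \dfrac{1 - \beta}{1 - \beta(1 - 1/n)} \\
p_{\rm in|out}(2|2) &=& \dfrac{\beta\, \mathrm{Tr}\, O_2 r}{\mathrm{Tr}\, O_2 \rho}
\end{array} \qquad (34)$$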
This result shows that as $n \to \infty$, $\|e_{\rm joint}\|_{\rm avg} \to 0$, $p_{\rm in|out}(1|1) \to 1$, and $p_{\rm in|out}(2|2) \to \beta$. Thus, if the
state dimension is large and the statistical mixture of the residual states tends to average out to a random 
distribution over all states, then the probability of detecting a single pure state is very high. 



Restrictions on $\beta$. From (28), $\Delta$ has non-negative eigenvalues ($\Delta \ge 0$) only if $\beta r \ge (1 - \beta)\psi\psi^*$, or
equivalently, if,

$$\beta \ge \beta_0 = \frac{\psi^* r^{-1} \psi}{1 + \psi^* r^{-1} \psi} \qquad (35)$$

Hence, for $\beta < \beta_0$, $\Delta$ will have both positive and negative eigenvalues. If as in the above example,
$r = I_n/n$, then $\psi^* r^{-1} \psi = n$ and thus $\beta_0 = n/(1 + n)$.
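
The threshold can be derived in a few steps. The sketch below assumes $r > 0$, so that $r^{-1/2}$ exists, and uses the fact that a rank-one matrix $vv^*$ has the single nonzero eigenvalue $v^*v$:

```latex
\begin{align*}
\beta r - (1-\beta)\,\psi\psi^* \ge 0
  &\iff \beta I_n - (1-\beta)\, r^{-1/2}\psi\psi^* r^{-1/2} \ge 0 \\
  &\iff \beta \ge (1-\beta)\,\psi^* r^{-1}\psi \\
  &\iff \beta \ge \frac{\psi^* r^{-1}\psi}{1 + \psi^* r^{-1}\psi} .
\end{align*}
```

The first equivalence is a congruence with $r^{-1/2}$, and the second applies the eigenvalue fact with $v = r^{-1/2}\psi$.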

Suppose that $\Delta \ge 0$ ($\beta \ge \beta_0$). Then the only way to satisfy the optimality conditions (27) is to set

$$O_1 = 0, \quad O_2 = I_n \qquad (36)$$

The optimal objective value is now

$$\|e_{\rm joint}\|_{\rm avg}^{\rm opt} = \mathrm{Tr}\, A_2 = 1 - \beta \qquad (37)$$

This is just the occurrence probability of the pure state $\psi$; essentially the detector does nothing. Observe
that similar remarks can be made when $\Delta \le 0$.



5 Optimal worst-case a posteriori design 

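In this section the detector is designed to minimize the objective $\|e_{\rm post}\|_{\rm wc}$ in Table 2. This requires
solving the following optimization problem for the POVM matrices $\{ O_i \}$:

$$\begin{array}{ll}
\text{minimize} & \|e_{\rm post}\|_{\rm wc} = \max_{i=1,\ldots,m}\, w_i \left( 1 - \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho} \right) \\
\text{subject to} & \sum_{i=1}^{m} O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m \\
 & \mathrm{Tr}\, O_i \rho > 0, \quad i = 1, \ldots, m
\end{array} \qquad (38)$$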



As shown in [5, §4.3.2], the objective function, $\|e_{\rm post}\|_{\rm wc}$, is a maximum over a set of quasiconvex
functions each with domain $\mathrm{Tr}\, O_i \rho > 0$, $\forall i$, and hence, is a quasiconvex function over the domain
$\{ \mathrm{Tr}\, O_i \rho > 0 \mid i = 1, \ldots, m \}$. Since the POVM matrices $\{ O_i \}$ form a convex set, (38) is a quasiconvex
optimization problem in the POVM matrices. Technically this means that for any positive scalar $\delta$, the
sublevel sets of POVMs

$$\{ O_i,\ \mathrm{Tr}\, O_i \rho > 0,\ \forall i \mid \|e_{\rm post}\|_{\rm wc} \le \delta \} \qquad (39)$$

are convex. To see that these sets are convex in this case, observe that for POVMs in the domain
$\mathrm{Tr}\, O_i \rho > 0$, $\forall i$, the sublevel sets are equivalently,

$$\mathrm{Tr}\, O_i A_i(\delta) \le 0, \quad i = 1, \ldots, m \qquad (40)$$

with

$$A_i(\delta) = (w_i - \delta)\rho - w_i p_i \rho_i, \quad i = 1, \ldots, m \qquad (41)$$

The inequalities (40) are linear in the POVM elements, and hence define convex sets. We will refer to
the matrices $A_i(\delta)$ as the data matrices.
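
The passage from the sublevel set to the linear inequality is a single multiplication by the positive quantity $\mathrm{Tr}\, O_i\rho$. For each $i$ and any POVM in the domain:

```latex
\begin{align*}
w_i\Big(1 - \frac{p_i\,\mathrm{Tr}\,O_i\rho_i}{\mathrm{Tr}\,O_i\rho}\Big) \le \delta
  &\iff w_i\,\mathrm{Tr}\,O_i\rho - w_i p_i\,\mathrm{Tr}\,O_i\rho_i \le \delta\,\mathrm{Tr}\,O_i\rho \\
  &\iff \mathrm{Tr}\,O_i\big[(w_i-\delta)\rho - w_i p_i\rho_i\big] \le 0 .
\end{align*}
```

Since the worst-case norm is at most $\delta$ exactly when every term is, the sublevel set is the intersection of these $m$ half-spaces with the POVM constraint set.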
This effectively shows (see also Appendix A.2) that (38) is equivalent to,

$$\begin{array}{ll}
\text{minimize} & \delta \\
\text{subject to} & \sum_{i=1}^{m} O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m \\
 & \mathrm{Tr}\, O_i A_i(\delta) \le 0, \quad i = 1, \ldots, m
\end{array} \qquad (42)$$

The optimization variables for (42) are now the real positive scalar $\delta$ as well as the POVM matrices $\{ O_i \}$.
Observe that (42) does not include the constraint set $\mathrm{Tr}\, O_i \rho > 0$, $i = 1, \ldots, m$. Since $\rho > 0$, the only
way this constraint can be violated is if a POVM element is zero. If this occurs then the problem is ill-posed
and most likely that POVM element can be eliminated. Hence, from now on we do not explicitly
state this constraint.

As shown in [5, §4.2.5] and described in Appendix A.2, a solution to the quasiconvex optimization 
problem (38) or (42) can be obtained by solving a series of convex feasibility problems together with a 
bisection method. 
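
The bisection loop itself is short. The sketch below is a generic outline, not the procedure of Appendix A.2: `is_feasible` is a hypothetical caller-supplied oracle that reports whether some POVM satisfies the constraints of (42) for a fixed $\delta$ (in practice an SDP feasibility solve), and the default upper bound of 1 assumes weights $w_i \le 1$.

```python
def bisect_delta(is_feasible, lo=0.0, hi=1.0, tol=1e-6):
    """Smallest feasible delta, to within tol, by bisection.

    is_feasible(delta) -> bool is a hypothetical oracle: True when a POVM
    exists with Tr O_i A_i(delta) <= 0 for all i. Feasibility is monotone
    in delta because Tr O_i A_i(delta) is non-increasing in delta.
    """
    if not is_feasible(hi):
        raise ValueError("upper bound hi is not feasible")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid   # a POVM achieves worst-case error <= mid
        else:
            lo = mid   # no POVM achieves it; the optimum is larger
    return hi


if __name__ == "__main__":
    # Toy oracle standing in for the convex feasibility problem.
    print(bisect_delta(lambda d: d >= 0.3))
```

Each iteration halves the interval, so reaching accuracy `tol` takes about $\log_2((\mathrm{hi} - \mathrm{lo})/\mathrm{tol})$ feasibility solves.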



Optimality conditions 



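As derived in Appendix A.2, any feasible POVM (17) is optimal if and only if there exist real constants
$\delta^{\rm opt}$ and $\lambda_i$, $i = 1, \ldots, m$ such that,

$$\begin{array}{ll}
\lambda_i > 0, & i \in S \\
\lambda_i = 0, & i \notin S \\
\lambda_i A_i(\delta^{\rm opt}) - \sum_{j=1}^{m} \lambda_j A_j(\delta^{\rm opt}) O_j \ge 0, & i = 1, \ldots, m \\
\Big( \lambda_i A_i(\delta^{\rm opt}) - \sum_{j=1}^{m} \lambda_j A_j(\delta^{\rm opt}) O_j \Big) O_i = 0, & i = 1, \ldots, m
\end{array} \qquad (43)$$

The index set $S$ consists only of those indices where the optimal $\delta^{\rm opt}$ in (42) is achieved. Thus $S$ is
equivalently expressed by,

$$\begin{array}{rcl}
S &=& \{ i = 1, \ldots, m \mid \mathrm{Tr}\, O_i A_i(\delta^{\rm opt}) = 0 \} \\
 &=& \left\{ i = 1, \ldots, m \ \Big|\ \delta^{\rm opt} = w_i \left( 1 - \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho} \right) \right\}
\end{array} \qquad (44)$$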

Some special cases follow. 



Equal weights: one active linear constraint 

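For equal weights, $w_i = 1$, $\forall i$, (38) can be expressed equivalently by,

$$\begin{array}{ll}
\text{maximize} & \gamma \\
\text{subject to} & p_{\rm in|out}(i|i) = \dfrac{p_i\, \mathrm{Tr}\, O_i \rho_i}{\mathrm{Tr}\, O_i \rho} \ge \gamma, \quad i = 1, \ldots, m \\
 & \sum_{i=1}^{m} O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array} \qquad (45)$$

Clearly $\gamma = 1 - \delta$ with $\delta$ from (42). Let $\gamma^{\rm opt}$ denote the optimal objective value in (45). Suppose only
one linear constraint is active, that is, for $i = k$, $\mathrm{Tr}\, O_k A_k(\gamma^{\rm opt}) = 0$ and for $i \neq k$, $\mathrm{Tr}\, O_i A_i(\gamma^{\rm opt}) < 0$.
Then, as shown in Appendix A.2, the optimality conditions (43) become,

$$\begin{array}{rcl}
A_k(\gamma^{\rm opt}) (I - O_k) &\ge& 0 \\
A_k(\gamma^{\rm opt}) O_k &\le& 0 \\
A_k(\gamma^{\rm opt}) (I - O_k) O_k &=& 0 \\
A_k(\gamma^{\rm opt}) O_k O_i &=& 0, \quad i \neq k \\
\mathrm{Tr}\, O_k A_k(\gamma^{\rm opt}) &=& 0
\end{array} \qquad (46)$$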
with $\gamma^{\rm opt}$ given by,

$$\gamma^{\rm opt} = p_k\, \sigma_{\max}(\rho^{-1/2} \rho_k \rho^{-1/2}) \qquad (47)$$

where $\sigma_{\max}(\cdot)$ is the maximum singular value of the matrix argument. (Note that $\rho^{-1/2}$ exists because
$\rho > 0$ is assumed (21).) Since only one constraint is assumed active, (47) is equivalent to,

$$\gamma^{\rm opt} = \min_{i=1,\ldots,m}\ p_i\, \sigma_{\max}(\rho^{-1/2} \rho_i \rho^{-1/2}) \qquad (48)$$

If the input states are pure, that is,

$$\rho_i = \psi_i \psi_i^*, \quad i = 1, \ldots, m \qquad (49)$$

with $\psi_i \in \mathbf{C}^{n}$, $\psi_i^* \psi_i = 1$, then (47) becomes,

$$\gamma^{\rm opt} = \min_{i=1,\ldots,m}\ p_i\, \psi_i^* \rho^{-1} \psi_i \qquad (50)$$

Single pure state detection 

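Consider again the input set (16) where the goal is to detect $\psi$. With the weights set to $w_1 = 1$, $w_2 = 0$,
the data matrices are,

$$
\begin{aligned}
A_1(\gamma) &= \gamma\rho - (1 - \beta)\psi\psi^* \\
A_2(\delta) &= -\delta\rho
\end{aligned}
\tag{51}
$$

with $\gamma = 1 - \delta$. Unless $\gamma^{\mathrm{opt}} = 1$ ($\delta^{\mathrm{opt}} = 0$), it follows that $\operatorname{Tr} O_2 A_2(\delta^{\mathrm{opt}}) = -\delta^{\mathrm{opt}} \operatorname{Tr} O_2 \rho < 0$.
Thus, the only active constraint is $\operatorname{Tr} O_1 A_1(\gamma^{\mathrm{opt}}) = 0$, which makes the index set $S$ the singleton
$S = \{1\}$, and hence (46)-(47) applies. Using the Matrix Inversion Lemma to compute $\psi^* \rho^{-1} \psi$ with
$\rho = (1 - \beta)\psi\psi^* + \beta r$ gives,

$$
\gamma^{\mathrm{opt}} = (1 - \beta)\, \psi^* \rho^{-1} \psi = \frac{(1 - \beta)\, \psi^* r^{-1} \psi}{\beta + (1 - \beta)\, \psi^* r^{-1} \psi}
\tag{52}
$$

Observe that $\gamma^{\mathrm{opt}}$ increases as $\psi^* r^{-1} \psi$ increases. If $\psi$ is close to a singular vector of $r$ which has a
very small singular value, then $\psi^* r^{-1} \psi$ will be large, and hence $\gamma^{\mathrm{opt}} \approx 1$. This can be construed as an
approximate orthogonality condition. In the special case when $r = I_n/n$, then $\psi^* r^{-1} \psi = n$, and hence,

$$
\gamma^{\mathrm{opt}} = \frac{1 - \beta}{1 - \beta(1 - 1/n)}
\tag{53}
$$

This is exactly the result in (34), which in general is not to be expected.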
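As a quick numerical sanity check, the sketch below compares the direct evaluation $(1-\beta)\psi^*\rho^{-1}\psi$ with the closed form that the Matrix Inversion Lemma yields, $(1-\beta)a / (\beta + (1-\beta)a)$ with $a = \psi^* r^{-1}\psi$, and then checks the special case $r = I_n/n$ for $n = 2$. The vector `psi`, the residual state `r`, and the value of `beta` are illustrative assumptions, not values from the paper; the example is restricted to real 2x2 matrices so that it needs only the standard library.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(v, m):
    """Quadratic form v^T m v for a real 2-vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

def gamma_direct(psi, r, beta):
    """(1 - beta) psi^T rho^{-1} psi with rho = (1 - beta) psi psi^T + beta r."""
    rho = [[(1 - beta) * psi[i] * psi[j] + beta * r[i][j] for j in range(2)]
           for i in range(2)]
    return (1 - beta) * quad(psi, inv2(rho))

def gamma_closed(psi, r, beta):
    """Closed form obtained from the Matrix Inversion Lemma."""
    a = quad(psi, inv2(r))  # a = psi^T r^{-1} psi
    return (1 - beta) * a / (beta + (1 - beta) * a)

psi = [math.cos(0.3), math.sin(0.3)]  # hypothetical unit vector
r = [[0.7, 0.1], [0.1, 0.3]]          # hypothetical residual state (trace 1, positive definite)
beta = 0.4                            # hypothetical residual-state probability

g_direct = gamma_direct(psi, r, beta)
g_closed = gamma_closed(psi, r, beta)

# Special case r = I/n with n = 2, compared against (1 - beta) / (1 - beta (1 - 1/n))
r_flat = [[0.5, 0.0], [0.0, 0.5]]
g_flat = gamma_closed(psi, r_flat, beta)
g_special = (1 - beta) / (1 - beta * (1 - 1 / 2))
```

The two evaluations agree to rounding error, and increasing $\psi^* r^{-1}\psi$ (for example by aligning `psi` with the weak eigen-direction of `r`) pushes the value toward 1.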

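The (two) POVM elements $O_1$ and $O_2$ associated with the two state input set (16) can be directly
calculated from the optimality conditions (46). Using the fact that $O_1 + O_2 = I$, the optimality conditions
(46) become:

$$
\begin{aligned}
A_1(\gamma^{\mathrm{opt}}) O_2 &\ge 0, & A_1(\gamma^{\mathrm{opt}}) O_2 O_1 &= 0 \\
A_1(\gamma^{\mathrm{opt}}) O_1 &\le 0, & A_1(\gamma^{\mathrm{opt}}) O_1 O_2 &= 0
\end{aligned}
\tag{54}
$$

Observe that because $\lambda_2 = 0$, the data matrix $A_2(\delta^{\mathrm{opt}})$ plays no part in the optimality conditions. Using
$\gamma^{\mathrm{opt}}$ from (52) makes $\operatorname{rank} A_1(\gamma^{\mathrm{opt}}) = n - 1$, and hence $A_1(\gamma^{\mathrm{opt}})$ has the decomposition,

$$
\begin{aligned}
A_1(\gamma^{\mathrm{opt}}) &= \gamma^{\mathrm{opt}} \rho - (1 - \beta)\psi\psi^*
 = \begin{bmatrix} U_+ & U_0 \end{bmatrix}
   \begin{bmatrix} \Omega_+ & 0_{n-1} \\ 0_{n-1}^T & 0 \end{bmatrix}
   \begin{bmatrix} U_+^* \\ U_0^* \end{bmatrix} \\
\Omega_+ &= \operatorname{diag}(\omega_1, \ldots, \omega_{n-1}), \quad \omega_1 \ge \omega_2 \ge \cdots \ge \omega_{n-1} > 0
\end{aligned}
\tag{55}
$$

for unitary $[U_+ \; U_0] \in \mathbf{C}^{n \times n}$ with $U_+ \in \mathbf{C}^{n \times (n-1)}$ and $U_0 \in \mathbf{C}^{n \times 1}$. Setting,

$$
O_1 = U_0 U_0^*, \quad O_2 = U_+ U_+^*
\tag{56}
$$

gives $A_1(\gamma^{\mathrm{opt}}) O_2 = U_+ \Omega_+ U_+^* \ge 0$, $A_1(\gamma^{\mathrm{opt}}) O_1 = 0$, $O_1 O_2 = 0$, thus satisfying the optimality
conditions (54). Observe also that $O_1$ is a rank 1 projector, and $O_2$ is a rank $n - 1$ projector.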

If $r = I_n/n$, then the a posteriori probabilities are exactly the same as given by (34); again, this is
not the case in general.



Single state detection with pure residual state



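In the previous example, as long as the residual state $r > 0$, it is not possible to make $\gamma^{\mathrm{opt}} = 1$.
To see this, observe that $A_1(\gamma^{\mathrm{opt}} = 1) = \rho - (1 - \beta)\psi\psi^* = \beta r > 0$, and hence has only positive
eigenvalues. Thus, the optimality conditions can only be satisfied with $O_1 = 0$, $O_2 = I$. (Effectively $U_0$
in (55) is null.) This choice of $\gamma^{\mathrm{opt}}$ is therefore infeasible.

Now consider the input set,

$$
D_{\mathrm{in}} = \{ (\rho_0, 1 - \beta), \; (\phi\phi^*, \beta) \}
\tag{57}
$$

with the pure residual state $\phi \in \mathbf{C}^n$, $\phi^*\phi = 1$ occurring with probability $\beta$ and the state to be detected
$\rho_0 \in \mathbf{C}^{n \times n}$, $\rho_0 \ge 0$ occurring with probability $1 - \beta$. In this case for $\gamma^{\mathrm{opt}} = 1$, we get,

$$
A_1(\gamma^{\mathrm{opt}} = 1) = \rho - (1 - \beta)\rho_0 = \beta\phi\phi^*
 = \beta \begin{bmatrix} U_+ & U_0 \end{bmatrix}
   \begin{bmatrix} 1 & 0_{n-1}^T \\ 0_{n-1} & 0_{(n-1) \times (n-1)} \end{bmatrix}
   \begin{bmatrix} U_+^* \\ U_0^* \end{bmatrix}
\tag{58}
$$

with $U_+ \in \mathbf{C}^{n \times 1}$, $U_0 \in \mathbf{C}^{n \times (n-1)}$. The choice $O_1 = U_0 U_0^*$, $O_2 = U_+ U_+^*$ satisfies the optimality
conditions. Hence, perfect deterministic detection of a single state, pure or mixed, is possible if the
residual state is pure.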



5.1 Noisy measurements 

The optimal detector design problem can be modified to handle a "noisy" set of measurements. In general
there can be more noisy measurements than noise-free measurements. Consider, for example, a photon
detection device with two photon-counting detectors. If both are noise-free, meaning perfect efficiency
and no dark count probability, then, provided one photon is always present at the input of the device,
there are only two possible outcomes: {10, 01}. If, however, each detector is noisy, then either or both
detectors can misfire or fire even with a photon always present at the input. Thus in the noisy case there
are four possible outcomes: {10, 01, 11, 00}.

As before, let $\{O_i\}$ denote the $m$ noise-free POVM matrices. Now let $\{O_i^{\mathrm{noisy}}\}$ denote the $\tilde m$ noisy
measurements with $\tilde m \ge m$. The noisy measurements can be expressed as,

$$
O_i^{\mathrm{noisy}} = \sum_{j=1}^{m} \nu_{ij} O_j, \quad i = 1, \ldots, \tilde m
\tag{59}
$$

The $\{\nu_{ij}\}$ represent the noise in the measurement, specifically, the conditional probability that $i$ is
measured given the noise-free outcome $j$. Since $\sum_{i=1}^{\tilde m} \nu_{ij} = 1$, it follows that the noisy set $\{O_i^{\mathrm{noisy}}\}$ is
also a POVM. Thus,

$$
\sum_{i=1}^{\tilde m} O_i^{\mathrm{noisy}} = I_n, \quad O_i^{\mathrm{noisy}} \ge 0, \quad i = 1, \ldots, \tilde m
\tag{60}
$$
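The mixing rule $O_i^{\mathrm{noisy}} = \sum_j \nu_{ij} O_j$ can be illustrated with the two-detector photon example. The sketch below is only an illustration: the 4x2 noise matrix `nu` (rows ordered as the outcomes 10, 01, 11, 00) holds made-up conditional probabilities, and the noise-free POVM is taken to be the two projectors in the natural basis. It checks that column sums equal to one make the noisy elements sum to the identity.

```python
def mix_povm(nu, povm):
    """Noisy elements O_i^noisy = sum_j nu[i][j] * O_j for every noisy outcome i."""
    n = len(povm[0])
    return [[[sum(nu[i][j] * povm[j][a][b] for j in range(len(povm)))
              for b in range(n)] for a in range(n)]
            for i in range(len(nu))]

def total(mats):
    """Entrywise sum of a list of square matrices."""
    n = len(mats[0])
    return [[sum(m[a][b] for m in mats) for b in range(n)] for a in range(n)]

# Noise-free POVM: hypothetical projectors onto the two natural basis vectors
povm = [[[1.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 1.0]]]

# Hypothetical noise matrix: nu[i][j] = Prob(noisy outcome i | noise-free outcome j)
nu = [[0.90, 0.03],   # outcome 10
      [0.03, 0.90],   # outcome 01
      [0.04, 0.04],   # outcome 11 (both detectors fire)
      [0.03, 0.03]]   # outcome 00 (neither fires)

column_sums = [sum(row[j] for row in nu) for j in range(2)]
noisy = mix_povm(nu, povm)
noisy_total = total(noisy)   # should reproduce the 2x2 identity
```

Because each noisy element is a non-negative combination of positive semidefinite matrices, positivity is inherited automatically; only the column-sum condition needs checking.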
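In matrix form,

$$
\begin{bmatrix} O_1^{\mathrm{noisy}} \\ \vdots \\ O_{\tilde m}^{\mathrm{noisy}} \end{bmatrix}
=
\begin{bmatrix}
\nu_{11} I_n & \cdots & \nu_{1m} I_n \\
\vdots & & \vdots \\
\nu_{\tilde m 1} I_n & \cdots & \nu_{\tilde m m} I_n
\end{bmatrix}
\begin{bmatrix} O_1 \\ \vdots \\ O_m \end{bmatrix}
\tag{61}
$$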



When the equivalent noisy POVM matrices, $\{O_i^{\mathrm{noisy}}\}$, are inserted into (38), either objective function
retains the same form with the $\{O_i^{\mathrm{noisy}}\}$ replacing the $\{O_i\}$. The design variables are still the noise-free
POVM matrices $\{O_i\}$. Since the noisy POVM matrices, $\{O_i^{\mathrm{noisy}}\}$, are linear in the noise-free POVM
matrices, $\{O_i\}$, the design problems in Table 2 remain convex or quasiconvex optimization problems
over the noise-free POVM matrices $\{O_i\}$.



Optimal worst-case a posteriori performance with noisy measurements 

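With noisy measurements, (38) becomes,

$$
\begin{array}{ll}
\text{minimize} & \| e_{\mathrm{post}} \|_{\mathrm{wc}} = \max_{i=1,\ldots,m} w_i \Big( 1 - \dfrac{p_i \operatorname{Tr} O_i^{\mathrm{noisy}} \rho_i}{\operatorname{Tr} O_i^{\mathrm{noisy}} \rho} \Big) \\
\text{subject to} & O_i^{\mathrm{noisy}} = \sum_{j=1}^m \nu_{ij} O_j, \quad i = 1, \ldots, \tilde m \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{62}
$$

Under the assumption that $\operatorname{Tr} O_i^{\mathrm{noisy}} \rho > 0$, $\forall i$, (62) is equivalent to,

$$
\begin{array}{ll}
\text{minimize} & \delta \\
\text{subject to} & O_i^{\mathrm{noisy}} = \sum_{j=1}^m \nu_{ij} O_j, \quad i = 1, \ldots, \tilde m \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m \\
 & \operatorname{Tr} O_i^{\mathrm{noisy}} A_i(\delta) \le 0, \quad i = 1, \ldots, m
\end{array}
\tag{63}
$$

The optimization variables for (63) are now the real positive scalar $\delta$ as well as the noise-free POVM
matrices $\{O_i\}$. The data matrices, $A_i(\delta)$, are given by (41).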

Optimality conditions 

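As derived in Appendix A.3, any feasible POVM (17) is optimal if and only if there exist real constants
$\lambda_i$, $i = 1, \ldots, m$ such that,

$$
\begin{aligned}
\lambda_i &\ge 0, \quad i \in S \\
\lambda_i &= 0, \quad i \notin S \\
\lambda_i A_i(\delta^{\mathrm{opt}}, \nu) - \sum_{j=1}^m \lambda_j A_j(\delta^{\mathrm{opt}}, \nu) O_j &\ge 0, \quad i = 1, \ldots, m \\
\Big[ \lambda_i A_i(\delta^{\mathrm{opt}}, \nu) - \sum_{j=1}^m \lambda_j A_j(\delta^{\mathrm{opt}}, \nu) O_j \Big] O_i &= 0, \quad i = 1, \ldots, m \\
\sum_{i=1}^m \operatorname{Tr} A_i(\delta^{\mathrm{opt}}, \nu) O_i &= 0
\end{aligned}
\tag{64}
$$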

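with the matrices $A_i(\delta, \nu)$ defined in (65), and the index set $S$ given by,

$$
S = \{\, i = 1, \ldots, m \mid \operatorname{Tr} O_i A_i(\delta^{\mathrm{opt}}, \nu) = 0 \,\}
\tag{66}
$$

where $\delta^{\mathrm{opt}}$ is the optimal objective value from (63).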
Single pure state detection

Consider again the input set (16) with weights $w_1 = 1$, $w_2 = 0$. Suppose the measurement noise matrix
is,

$$
\nu = \begin{bmatrix} 1 - \nu_0 & \nu_0 \\ \nu_0 & 1 - \nu_0 \end{bmatrix}
\tag{67}
$$



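Assuming the only active constraint to be $\operatorname{Tr} O_1^{\mathrm{noisy}} A_1(\delta) = 0$, then the multipliers are $\lambda_1 = 1$, $\lambda_2 = 0$
and the optimality conditions become:

$$
\begin{aligned}
(1 - 2\nu_0) A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_2 &\ge 0, & A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_2 O_1 &= 0 \\
(1 - 2\nu_0) A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_1 &\le 0, & A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_1 O_2 &= 0
\end{aligned}
\tag{68}
$$

with $A_1(\gamma)$ from (51). Assume further that $\nu_0 < 1/2$. Then the matrix inequalities in (68) are the same
as in (54), namely,

$$
A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_2 \ge 0, \quad A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_1 \le 0
\tag{69}
$$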
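Now again introduce the decomposition,

$$
A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) =
\begin{bmatrix} U_+ & U_- \end{bmatrix}
\begin{bmatrix} \Omega_+ & 0 \\ 0 & \Omega_- \end{bmatrix}
\begin{bmatrix} U_+^* \\ U_-^* \end{bmatrix}
\tag{70}
$$

where $(\Omega_+, \Omega_-)$ are diagonal matrices consisting, respectively, of the positive and negative eigenvalues
of $A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}})$. As in the previous examples, make the choice $O_1 = U_- U_-^*$, $O_2 = U_+ U_+^*$. The matrix
inequalities and equalities in the optimality conditions are satisfied by this choice. The scalar (trace)
condition is satisfied provided that,

$$
\operatorname{Tr} \Omega_- = -\frac{\nu_0}{1 - \nu_0} \operatorname{Tr} \Omega_+
\tag{71}
$$

The noise-free case, $\nu_0 = 0$, requires that $\operatorname{Tr} \Omega_- = 0$, which means that $\Omega_- = 0$. This is the condition
for the decomposition in (55), which can only occur for $\gamma^{\mathrm{opt}}_{\mathrm{noisy}} = \gamma^{\mathrm{opt}}$ from (52). When noise is present,
$\nu_0 > 0$, it is necessary that $\Omega_- < 0$, and hence $\gamma^{\mathrm{opt}}_{\mathrm{noisy}} < \gamma^{\mathrm{opt}}$ as might be expected; noise reduces the a
posteriori probability of detection.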

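To illustrate this further, suppose again that $r = I_n/n$ and we use the decomposition $\psi\psi^* =
U \operatorname{diag}(0, \ldots, 0, 1)\, U^*$. This gives,

$$
\Omega_+ = \gamma^{\mathrm{opt}}_{\mathrm{noisy}} \frac{\beta}{n} I_{n-1}, \quad
\Omega_- = \gamma^{\mathrm{opt}}_{\mathrm{noisy}} \big( 1 - \beta(1 - 1/n) \big) - (1 - \beta)
\tag{72}
$$

Consequently, (71) holds if

$$
\gamma^{\mathrm{opt}}_{\mathrm{noisy}} = \frac{1 - \beta}{1 - \beta \Big( 1 - \dfrac{\nu_0}{1 - \nu_0} \Big) \Big( 1 - \dfrac{1}{n} \Big)}
\tag{73}
$$

This clearly shows that $\gamma^{\mathrm{opt}}_{\mathrm{noisy}} < \gamma^{\mathrm{opt}} = (1 - \beta)/(1 - \beta(1 - 1/n))$. In addition, as $n \to \infty$,

$$
\gamma^{\mathrm{opt}}_{\mathrm{noisy}} \to \frac{1 - \beta}{1 - \beta \Big( 1 - \dfrac{\nu_0}{1 - \nu_0} \Big)}
\tag{74}
$$

When $\nu_0 > 1/2$, the matrix inequalities in (68) reverse and become,

$$
A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_2 \le 0, \quad A_1(\gamma^{\mathrm{opt}}_{\mathrm{noisy}}) O_1 \ge 0
\tag{75}
$$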
These are satisfied if the POVM matrices for $\nu_0 < 1/2$ are also reversed, that is, set to $O_1 = U_+ U_+^*$, $O_2 =
U_- U_-^*$ with $(U_+, U_-)$ from the decomposition (70). For a fixed $\beta$, the optimal objective value with
$\nu_0 < 1/2$ will always be greater than the value with $\nu_0 > 1/2$. When $\nu_0 = 1/2$, this type of detector can
do no better than $\gamma = 1 - \beta$, the occurrence probability for the pure state.
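The effect of the noise level on the single-pure-state case with $r = I_n/n$ can be checked numerically. The sketch below assumes the closed form obtained by substituting the eigenvalues $\Omega_+ = \gamma(\beta/n) I_{n-1}$ and $\Omega_- = \gamma(1 - \beta(1 - 1/n)) - (1 - \beta)$ into the trace condition $(1 - \nu_0)\operatorname{Tr}\Omega_- + \nu_0 \operatorname{Tr}\Omega_+ = 0$ and solving for $\gamma$. The values of `beta` and `n` are arbitrary illustrations.

```python
def gamma_noisy(beta, nu0, n):
    """Worst-case a posteriori probability for r = I/n and noise level nu0 <= 1/2."""
    shrink = 1 - nu0 / (1 - nu0)          # equals (1 - 2 nu0) / (1 - nu0)
    return (1 - beta) / (1 - beta * shrink * (1 - 1 / n))

def trace_residual(gamma, beta, nu0, n):
    """(1 - nu0) Tr Omega_- + nu0 Tr Omega_+; zero when the trace condition holds."""
    trace_plus = (n - 1) * gamma * beta / n
    omega_minus = gamma * (1 - beta * (1 - 1 / n)) - (1 - beta)
    return (1 - nu0) * omega_minus + nu0 * trace_plus

beta, n = 0.5, 4                           # hypothetical values
levels = [0.0, 0.02, 0.1, 0.3, 0.5]
gammas = [gamma_noisy(beta, nu0, n) for nu0 in levels]
residuals = [trace_residual(g, beta, nu0, n) for g, nu0 in zip(gammas, levels)]
gamma_noise_free = (1 - beta) / (1 - beta * (1 - 1 / n))
```

At `nu0 = 0` the value coincides with the noise-free result, it decreases monotonically as the noise grows, and at `nu0 = 1/2` it reduces to `1 - beta`.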



5.2 Unambiguous detection 

As discussed briefly in Section 2.3, an unambiguous detector is one that with some probability either
detects the correct state or else declares the result inconclusive. This requires an additional POVM
element to account for the inconclusive result. Specifically, as before, let $O_i$, $i = 1, \ldots, m$ correspond
to the $m$ input states $\rho_i$, $i = 1, \ldots, m$ and let $O_0$ correspond to the inconclusive result. Thus there are
$m + 1$ POVM elements, $O_i$, $i = 0, \ldots, m$. The probability of an inconclusive result is therefore,

$$
p_{\mathrm{incl}} = \operatorname{Tr} O_0 \rho
\tag{76}
$$

The ideal unambiguous detector is one where the a posteriori probability error is zero, or equivalently
$p_{\mathrm{in|out}}(i|i) = 1$, $i = 1, \ldots, m$, which can only occur when (10) holds, which here becomes,

$$
p_{\mathrm{out|in}}(i|j) = \operatorname{Tr} O_i \rho_j = p(i)\, \delta_{ij}, \quad i, j = 1, \ldots, m
\tag{77}
$$

Observe that if $p(i) = 1$, $i = 1, \ldots, m$ then from (12) $p_{\mathrm{incl}} = 0$, and hence $O_0 = 0$, which eliminates
the need for the extra (inconclusive) detector outcome and reduces to the condition for perfect detection
(9). Allowing for a non-zero probability of an inconclusive result opens the possibility that a detector
can be designed to satisfy (11), and thus $e_{\mathrm{post}}(i) = 0$, $i = 1, \ldots, m$.



Optimal a posteriori performance with an inconclusive outcome

By relaxing the requirement for zero error we can find a randomized detector by solving the following
problem.

$$
\begin{array}{ll}
\text{minimize} & \| e_{\mathrm{post}} \|_{\mathrm{wc}} = \max_{i=1,\ldots,m} w_i \Big( 1 - \dfrac{p_i \operatorname{Tr} O_i \rho_i}{\operatorname{Tr} O_i \rho} \Big) \\
\text{subject to} & \sum_{i=0}^m O_i = I_n, \quad O_i \ge 0, \quad i = 0, \ldots, m
\end{array}
\tag{78}
$$

If $\| e_{\mathrm{post}} \|_{\mathrm{wc}} = 0$ then we have found an unambiguous detector, one that either produces the correct
result or is inconclusive. Otherwise the detector is randomized, but not unambiguous. However, the
extra design freedom in the inconclusive POVM matrix, $O_0$, ensures that the resulting $\| e_{\mathrm{post}} \|_{\mathrm{wc}}$ will
never be larger than the value obtained for a detector without the additional inconclusive outcome.



Optimality conditions 

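Observe also that the only difference between problem (78) and problem (38) is the extra POVM element
$O_0$. As a result the optimality conditions have extra constraints to account for the additional element.
Specifically, any feasible POVM is optimal if and only if there exist real constants $\lambda_i$, $i = 1, \ldots, m$ such
that,

$$
\begin{aligned}
\lambda_i &\ge 0, \quad i \in S \\
\lambda_i &= 0, \quad i \notin S \\
\lambda_i A_i(\delta^{\mathrm{opt}}) - \sum_{j=1}^m \lambda_j A_j(\delta^{\mathrm{opt}}) O_j &\ge 0, \quad i = 1, \ldots, m \\
\Big( \lambda_i A_i(\delta^{\mathrm{opt}}) - \sum_{j=1}^m \lambda_j A_j(\delta^{\mathrm{opt}}) O_j \Big) O_i &= 0, \quad i = 1, \ldots, m
\end{aligned}
\tag{79}
$$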
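The index set $S$ consists only of those indices where the optimal objective value from (78) is achieved.
Thus,

$$
S = \{\, i = 1, \ldots, m \mid \operatorname{Tr} O_i A_i(\delta^{\mathrm{opt}}) = 0 \,\}
\tag{80}
$$

If the optimal is achieved at only one constraint, say $i = k$, then $\lambda_k = 1$, $\lambda_i = 0$, $i \ne k$, and the optimality
conditions become:

$$
\begin{aligned}
A_k(\delta^{\mathrm{opt}})(I - O_k) &\ge 0 \\
A_k(\delta^{\mathrm{opt}}) O_k &\le 0 \\
A_k(\delta^{\mathrm{opt}})(I - O_k) O_k &= 0 \\
\big( A_k(\delta^{\mathrm{opt}}) O_k \big) O_i &= 0, \quad i \in \{1, \ldots, m\}, \; i \ne k \\
A_k(\delta^{\mathrm{opt}}) O_k O_0 &= 0
\end{aligned}
\tag{81}
$$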
Optimal a posteriori performance with an inconclusive outcome and measurement noise 

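Problem (78) can be modified to account for measurement noise.

$$
\begin{array}{ll}
\text{minimize} & \| e_{\mathrm{post}} \|_{\mathrm{wc}} = \max_{i=1,\ldots,m} w_i \Big( 1 - \dfrac{p_i \operatorname{Tr} O_i^{\mathrm{noisy}} \rho_i}{\operatorname{Tr} O_i^{\mathrm{noisy}} \rho} \Big) \\
\text{subject to} & O_i^{\mathrm{noisy}} = \sum_{j=0}^m \nu_{ij} O_j, \quad i = 0, \ldots, m \\
 & \sum_{i=0}^m O_i = I_n, \quad O_i \ge 0, \quad i = 0, \ldots, m
\end{array}
\tag{82}
$$

In this case, because of the noise, it is doubtful that an unambiguous detector can be found. Nonetheless,
the resulting randomized detector will still outperform one without an inconclusive outcome.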

5.3 Example 

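Consider the following two (pure) input states and corresponding occurrence probabilities:

$$
\begin{aligned}
\rho_1 &= \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}^T, &
\rho_2 &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}^T \\
p_{\mathrm{in}}(1) &= 2/3, & p_{\mathrm{in}}(2) &= 1/3
\end{aligned}
\tag{83}
$$

Throughout this example we place equal weights on each state,

$$
[w_1, w_2] = [1, 1]
\tag{84}
$$

Optimizing the worst-case a posteriori probability measure, (38), returns a posteriori probabilities¹

$$
p_{\mathrm{in|out}}(1|1) = 0.87, \quad p_{\mathrm{in|out}}(2|2) = 0.87
\tag{85}
$$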
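and POVM matrices which are well approximated by the rank-one projectors²

$$
O_1 = \begin{bmatrix} 0.53 \\ 0.85 \end{bmatrix} \begin{bmatrix} 0.53 \\ 0.85 \end{bmatrix}^T, \quad
O_2 = \begin{bmatrix} -0.85 \\ 0.53 \end{bmatrix} \begin{bmatrix} -0.85 \\ 0.53 \end{bmatrix}^T
\tag{86}
$$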
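The reported a posteriori probabilities can be reproduced from the rounded projector vectors with Bayes' rule, $p_{\mathrm{in|out}}(i|i) = p_i \operatorname{Tr} O_i\rho_i / \operatorname{Tr} O_i\rho$. The sketch below assumes the input states are $\psi_1 = [1/\sqrt{2}, 1/\sqrt{2}]^T$ and $\psi_2 = [1, 0]^T$ with occurrence probabilities 2/3 and 1/3, which is the reading of (83) that is consistent with the reported numbers; because the projector entries are rounded to two digits, agreement is only expected to about two digits.

```python
import math

def overlap_sq(v, psi):
    """Tr(v v^T psi psi^T) = (v . psi)^2 for real 2-vectors."""
    return (v[0] * psi[0] + v[1] * psi[1]) ** 2

s = 1 / math.sqrt(2)
states = [[s, s], [1.0, 0.0]]            # assumed reading of the two pure states
priors = [2 / 3, 1 / 3]                  # occurrence probabilities
povm_vectors = [[0.53, 0.85], [-0.85, 0.53]]  # rounded projector vectors from (86)

def posterior(i):
    """Probability that state i was present given that outcome i was declared."""
    joint = [priors[j] * overlap_sq(povm_vectors[i], states[j]) for j in range(2)]
    return joint[i] / sum(joint)

p11 = posterior(0)
p22 = posterior(1)
```

Both values come out close to 0.87, so the two linear constraints are (nearly) simultaneously active at the optimum, as expected for a worst-case design.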



Optimizing the worst-case a posteriori probability measure with the additional inconclusive outcome,
(78), for the bound $p_{\mathrm{incl}} \le 1$, returns an unambiguous detector with a posteriori and inconclusive
probabilities,

$$
p_{\mathrm{in|out}}(1|1) = 1, \quad p_{\mathrm{in|out}}(2|2) = 1, \quad p_{\mathrm{incl}} = 0.75
\tag{87}
$$

¹ All numerical results were obtained using SEDUMI [42]. The numbers shown are rounded to two significant digits.
² The positive semi-definite matrices returned by the convex program are approximated by rank-one projectors using a
singular value decomposition only when the maximum singular value is much greater than all the others.
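The associated POVM matrices (rank-one projectors) are

$$
O_1 = \begin{bmatrix} 0 \\ 0.62 \end{bmatrix} \begin{bmatrix} 0 \\ 0.62 \end{bmatrix}^T, \quad
O_2 = \begin{bmatrix} -0.62 \\ 0.62 \end{bmatrix} \begin{bmatrix} -0.62 \\ 0.62 \end{bmatrix}^T, \quad
O_0 = \begin{bmatrix} 0.79 \\ 0.49 \end{bmatrix} \begin{bmatrix} 0.79 \\ 0.49 \end{bmatrix}^T
\tag{88}
$$

This is an unambiguous detector which is perfectly correct 75% of the time.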
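The unambiguous design can be checked in the same way. The sketch below assumes the three rank-one elements are generated by the vectors $[0, 0.62]^T$, $[-0.62, 0.62]^T$ (conclusive outcomes 1 and 2) and $[0.79, 0.49]^T$ (inconclusive outcome), and the same input states as before, $\psi_1 = [1/\sqrt{2}, 1/\sqrt{2}]^T$ and $\psi_2 = [1, 0]^T$ with probabilities 2/3 and 1/3. These assignments are a reading of the rounded numbers, not a statement from the solver output.

```python
import math

def outer(v):
    """Rank-one matrix v v^T for a real 2-vector."""
    return [[v[a] * v[b] for b in range(2)] for a in range(2)]

def overlap_sq(v, psi):
    """Tr(v v^T psi psi^T) = (v . psi)^2."""
    return (v[0] * psi[0] + v[1] * psi[1]) ** 2

s = 1 / math.sqrt(2)
states = [[s, s], [1.0, 0.0]]
priors = [2 / 3, 1 / 3]
v1, v2, v0 = [0.0, 0.62], [-0.62, 0.62], [0.79, 0.49]

def posterior(v, i):
    """A posteriori probability of state i given the outcome generated by v."""
    joint = [priors[j] * overlap_sq(v, states[j]) for j in range(2)]
    return joint[i] / sum(joint)

p11 = posterior(v1, 0)      # outcome 1 never fires on state 2
p22 = posterior(v2, 1)      # outcome 2 never fires on state 1
p_incl = sum(priors[j] * overlap_sq(v0, states[j]) for j in range(2))

# Completeness check: the three elements should sum to (roughly) the identity
povm_sum = [[outer(v1)[a][b] + outer(v2)[a][b] + outer(v0)[a][b]
             for b in range(2)] for a in range(2)]
```

Each conclusive vector is orthogonal to the state it must never flag, which is what makes the conclusive outcomes error-free; the price is the large inconclusive probability.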
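Now we add 2% noise and solve (62) with

$$
\nu = \begin{bmatrix} 0.98 & 0.02 \\ 0.02 & 0.98 \end{bmatrix}
\tag{89}
$$

By comparison with (85)-(86) we now get

$$
p_{\mathrm{in|out}}(1|1) = 0.86, \quad p_{\mathrm{in|out}}(2|2) = 0.86
\tag{90}
$$

and similar POVM rank-one projectors

$$
O_1 = \begin{bmatrix} 0.55 \\ 0.83 \end{bmatrix} \begin{bmatrix} 0.55 \\ 0.83 \end{bmatrix}^T, \quad
O_2 = \begin{bmatrix} -0.83 \\ 0.55 \end{bmatrix} \begin{bmatrix} -0.83 \\ 0.55 \end{bmatrix}^T
\tag{91}
$$

Solving (82) with a similar 2% noise

$$
\nu_{\mathrm{incl}} = \begin{bmatrix} 0.98 & 0.01 & 0.01 \\ 0.01 & 0.98 & 0.01 \\ 0.01 & 0.01 & 0.98 \end{bmatrix}
\tag{92}
$$

gives the probabilities,

$$
p_{\mathrm{in|out}}(1|1) = 0.96, \quad p_{\mathrm{in|out}}(2|2) = 0.96, \quad p_{\mathrm{incl}} = 0.76
\tag{93}
$$

and POVM matrices which are well approximated by the rank-one projectors,

$$
O_1 = \begin{bmatrix} 0.04 \\ 0.46 \end{bmatrix} \begin{bmatrix} 0.04 \\ 0.46 \end{bmatrix}^T, \quad
O_2 = \begin{bmatrix} -0.68 \\ 0.66 \end{bmatrix} \begin{bmatrix} -0.68 \\ 0.66 \end{bmatrix}^T, \quad
O_0 = \begin{bmatrix} 0.73 \\ 0.59 \end{bmatrix} \begin{bmatrix} 0.73 \\ 0.59 \end{bmatrix}^T
\tag{94}
$$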



This is no longer an unambiguous detector but rather a randomized detector. For 76% of the time an
inconclusive result will occur. When the detector declares either state 1 or state 2, the probability of
being correct is 96%, which is better than the deterministic detector with probabilities of 86%. If the
situation is such that there is little penalty in waiting, then a higher probability outcome is promised by
the randomized detector.

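We now repeat all the above optimal designs for varying noise levels:

$$
\nu = \begin{bmatrix} 1 - \nu_0 & \nu_0 \\ \nu_0 & 1 - \nu_0 \end{bmatrix}, \quad
\nu_{\mathrm{incl}} = \begin{bmatrix} 1 - \nu_0 & \nu_0/2 & \nu_0/2 \\ \nu_0/2 & 1 - \nu_0 & \nu_0/2 \\ \nu_0/2 & \nu_0/2 & 1 - \nu_0 \end{bmatrix}, \quad
\nu_0 \in [0, 0.20]
\tag{95}
$$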



The results are plotted in Figure 2 for $\nu_0$ from 0 to 0.20 in 0.02 increments. The solid curves are the two
diagonal elements of the a posteriori probability matrix for the optimal randomized detector. Associated
with them is the dotted curve showing $p_{\mathrm{incl}}$, the probability of an inconclusive result. The dashed curves
are the two diagonal elements of the a posteriori probability matrix for the optimal deterministic detector.
As expected, the randomized detector outperforms the deterministic detector as seen by the fact that
the lower solid curve is always larger than the lower dashed curve. (The optimal worst-case design
minimizes the maximum error, which is equivalent to making the lower of the two curves as large as
possible.) In all cases the POVMs were easily approximated by rank-one projectors, but in no case were
the projectors in the natural basis.

The behavior of $p_{\mathrm{incl}}(\nu_0)$ is quite interesting. The inconclusive probability and the associated POVM
matrix become small at a noise level $\nu_0 \approx 0.12$, in effect turning off the randomized feature.

Figure 3 shows the robustness properties of the randomized and deterministic detectors. We fixed
the POVMs for the two cases at their optimal settings corresponding to the noise-free case ($\nu_0 = 0$).
The plots show what happens as the noise level increases. The probability levels are not all that different
from the optimal noisy results in Figure 2, but are of course not as good.



[Plot: $p_{\mathrm{in|out}}(i|i)$, $i = 1, 2$, for the randomized and deterministic detectors, and $p_{\mathrm{incl}}$, versus noise level $\nu_0$ from 0 to 0.2.]

Figure 2: $p_{\mathrm{in|out}}(i|i)$, $i = 1, 2$ optimized for each noise level via (38) and (78).
6 Extensions and Other Considerations



6.1 Uncertain dynamics 

The goal is to design the POVM $\{O_i\}$ in the presence of uncertain detector dynamics $Q \in D_{\mathrm{dyn}}$ as
illustrated in Figure 4.

[Block diagram: input state, uncertain dynamics $Q \in D_{\mathrm{dyn}}$, POVM, outcome $d \in D_{\mathrm{out}}$.]

Figure 4: Detector with uncertain dynamics.

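We will assume that $D_{\mathrm{dyn}}$ consists of a finite number of unitary operators $\{U_k\}$ with corresponding
occurrence probabilities $\{p_{\mathrm{dyn}}(k)\}$. Thus,

$$
D_{\mathrm{dyn}} = \{\, U_k \in \mathbf{C}^{n \times n} \mid k = 1, \ldots, \ell \,\}, \quad
p_{\mathrm{dyn}}(k) = \operatorname{Prob}\{ Q = U_k \}
\tag{96}
$$

The conditional probability (18) now becomes,

$$
p_{\mathrm{out|in}}(i|j) = \operatorname{Tr} O_i \hat\rho_j, \quad
\hat\rho_j = \sum_{k=1}^{\ell} p_{\mathrm{dyn}}(k)\, U_k \rho_j U_k^*
\tag{97}
$$

This clearly shows that the only change to make is to replace $\rho_j$ with $\hat\rho_j$ everywhere, specifically, in the
error probabilities (22) and in the output state $\rho$ as defined by (20).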
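The replacement of each input state by its average over the uncertain unitaries, $\hat\rho_j = \sum_k p_{\mathrm{dyn}}(k) U_k \rho_j U_k^*$, is easy to compute directly. The sketch below is a minimal illustration with a made-up two-element uncertainty set (identity with probability 0.9, a bit flip with probability 0.1) acting on a hypothetical pure state; none of these values come from the paper.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def averaged_state(rho, unitaries, probs):
    """Return sum_k p_k U_k rho U_k^* for the given unitaries and probabilities."""
    out = [[0j, 0j], [0j, 0j]]
    for u, p in zip(unitaries, probs):
        term = matmul(matmul(u, rho), dagger(u))
        for i in range(2):
            for j in range(2):
                out[i][j] += p * term[i][j]
    return out

identity = [[1, 0], [0, 1]]
flip = [[0, 1], [1, 0]]        # Pauli-X, a hypothetical disturbance
rho_1 = [[1, 0], [0, 0]]       # hypothetical pure input state

rho_1_hat = averaged_state(rho_1, [identity, flip], [0.9, 0.1])
trace = rho_1_hat[0][0] + rho_1_hat[1][1]
```

The averaged state is mixed even though the input was pure, and its trace is unchanged, so the design problems can be solved with $\hat\rho_j$ in place of $\rho_j$ without any other modification.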

The above representation of $Q$ is an example of the more generic Kraus operator sum representation
(OSR). Specifically, the Kraus matrices, $\{ K_k \in \mathbf{C}^{n \times n} \mid k = 1, \ldots, \ell \}$ with $\ell \le n^2$, can characterize a
large class of possibilities for the Q-system as follows:

$$
Q(\rho, K) = \sum_{k=1}^{\ell} K_k \rho K_k^*, \quad \sum_{k=1}^{\ell} K_k^* K_k \le I_n
\tag{98}
$$

Comparing this with (96) gives $K_k = \sqrt{p_{\mathrm{dyn}}(k)}\, U_k$ and $\sum_k K_k^* K_k = I_n$, which clearly is just one possibility.
For example, when $\sum_k K_k^* K_k < I_n$, additional measurement operations within $Q$ are included. The OSR also
accounts for many forms of error sources as well as decoherence, e.g., [34], [31].



6.2 Detector with fixed POVM 

In this section we consider designing the detector for a fixed POVM set. We will show that the detector 
dynamics when represented as an OSR (Operator-Sum-Representation) can also be designed by solving 

a quasiconvex optimization problem. 

Suppose we are given the POVM, {Oi}, and wish to design Q for optimal detection as shown in 
Figure 5. 



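[Block diagram: input set $D_{\mathrm{in}}$, detector consisting of the system $Q$ followed by a fixed POVM, outcome $d \in D_{\mathrm{out}}$.]

Figure 5: Detector with fixed POVM.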
The POVM would most likely be selected as rank-one projectors in the natural basis. For example, for
$i = 1, \ldots, m$, fix $b_i \in \mathbf{C}^m$ and $O_i = I_\ell \otimes b_i b_i^*$ with $\ell + m = n$, the dimension of the input state. The
input state might also consist of prepared ancilla states. In the natural basis $b_1^T = [1 \; 0 \; \cdots \; 0]$, $b_2^T =
[0 \; 1 \; 0 \; \cdots \; 0]$, ..., $b_m^T = [0 \; \cdots \; 0 \; 1]$.

As noted in Section 6.1, a very general form to characterize $Q$ is the Kraus OSR. Using (98), the a
posteriori performance probability is now,

$$
p_{\mathrm{in|out}}(i|i) = \frac{p_i \operatorname{Tr} O_i Q(\rho_i, K)}{\operatorname{Tr} O_i Q(\rho, K)}
\tag{99}
$$

which is quadratic (fractional) in the Kraus matrices. It can be transformed into a quasiconvex function
by expanding the Kraus matrices in a fixed basis. The procedure, described in [34, §8.4.2], is as follows:
since any matrix in $\mathbf{C}^{n \times n}$ can be represented by $n^2$ complex numbers, let

$$
\{\, B_\mu \in \mathbf{C}^{n \times n} \mid \mu = 1, \ldots, n^2 \,\}
\tag{100}
$$

be a basis for matrices in $\mathbf{C}^{n \times n}$. The Kraus matrices can thus be expressed as,

$$
K_k = \sum_{\mu=1}^{n^2} a_{k\mu} B_\mu, \quad k = 1, \ldots, \ell
\tag{101}
$$

where the coefficients $\{a_{k\mu}\}$ are complex scalars. As shown in [34] the representation (98) now
becomes,

$$
Q(\rho, X) = \sum_{\mu,\nu=1}^{n^2} X_{\mu\nu} B_\mu \rho B_\nu^*, \quad
\sum_{\mu,\nu=1}^{n^2} X_{\mu\nu} B_\nu^* B_\mu \le I_n
\tag{102}
$$

with

$$
X_{\mu\nu} = \sum_{k=1}^{\ell} a_{k\mu} a_{k\nu}^*
\tag{103}
$$

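The matrix $X \in \mathbf{C}^{n^2 \times n^2}$ with the above coefficients must also be non-negative in order to maintain
probabilities. The number of free (real) variables in $X$ is thus $n^4 - n^2$. In addition, we can write,

$$
p_{\mathrm{out}}(i) = \operatorname{Tr} O_i Q(\rho, X) = \operatorname{Tr} X R_i(\rho), \quad i = 1, \ldots, m
\tag{104}
$$

where the matrix $R_i(\rho) \in \mathbf{C}^{n^2 \times n^2}$ has elements given by,

$$
[R_i(\rho)]_{\nu\mu} = \operatorname{Tr} B_\mu \rho B_\nu^* O_i, \quad \mu, \nu = 1, \ldots, n^2
\tag{105}
$$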
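The change of variables from Kraus matrices to the matrix $X$ can be verified on a small case. The sketch below assumes the basis $\{B_\mu\}$ is the set of matrix units for $n = 2$ (so the coefficients $a_{k\mu}$ are simply the entries of $K_k$ read row by row), forms $X_{\mu\nu} = \sum_k a_{k\mu} a_{k\nu}^*$, and checks that $\sum_{\mu\nu} X_{\mu\nu} B_\mu \rho B_\nu^*$ reproduces $\sum_k K_k \rho K_k^*$. The Kraus pair used (an amplitude-damping-like channel with parameter `g`) and the test state are illustrative assumptions.

```python
import math

N = 2  # state dimension; the matrix basis then has N * N = 4 elements

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(N)] for i in range(N)]

def accumulate(total, term, scale):
    """In-place total += scale * term."""
    for i in range(N):
        for j in range(N):
            total[i][j] += scale * term[i][j]

def unit(a, b):
    """Matrix unit with a single 1 in row a, column b."""
    e = [[0j] * N for _ in range(N)]
    e[a][b] = 1 + 0j
    return e

basis = [unit(a, b) for a in range(N) for b in range(N)]

g = 0.3  # hypothetical damping parameter
kraus = [[[1, 0], [0, math.sqrt(1 - g)]],
         [[0, math.sqrt(g)], [0, 0]]]

# Coefficients a[k][mu]: entries of K_k in the same row-major order as the basis
coeffs = [[complex(K[a][b]) for a in range(N) for b in range(N)] for K in kraus]
X = [[sum(c[mu] * c[nu].conjugate() for c in coeffs) for nu in range(N * N)]
     for mu in range(N * N)]

def channel_from_kraus(rho):
    out = [[0j] * N for _ in range(N)]
    for K in kraus:
        accumulate(out, matmul(matmul(K, rho), dagger(K)), 1)
    return out

def channel_from_x(rho):
    out = [[0j] * N for _ in range(N)]
    for mu in range(N * N):
        for nu in range(N * N):
            term = matmul(matmul(basis[mu], rho), dagger(basis[nu]))
            accumulate(out, term, X[mu][nu])
    return out

rho = [[0.25, 0.25], [0.25, 0.75]]  # hypothetical input state
via_kraus = channel_from_kraus(rho)
via_x = channel_from_x(rho)
max_diff = max(abs(via_kraus[i][j] - via_x[i][j]) for i in range(N) for j in range(N))
```

The point of the substitution is that the channel output is linear in $X$, even though it is quadratic in the Kraus matrices, which is what makes the design problem over $X$ tractable.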
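The problem of optimally designing the "system" part of the detector, the Q-system, is equivalent to the
following optimization problem over the positive semidefinite matrix $X \in \mathbf{C}^{n^2 \times n^2}$.

$$
\begin{array}{ll}
\text{minimize} & \| e_{\mathrm{post}} \|_{\mathrm{wc}} = \max_{i=1,\ldots,m} w_i \Big( 1 - \dfrac{p_i \operatorname{Tr} X R_i(\rho_i)}{\operatorname{Tr} X R_i(\rho)} \Big) \\
\text{subject to} & \sum_{\mu,\nu} X_{\mu\nu} B_\nu^* B_\mu \le I_n, \quad X \ge 0
\end{array}
\tag{106}
$$

This problem, like (38), is also a quasiconvex optimization problem with the optimization variables being
the elements of the matrix $X$.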
Implementation of OSR

An OSR can be implemented using unitary operations (and if necessary projection measurements) and
the X-matrix can be transformed to Kraus operators via the singular value decomposition [34].
Specifically, let $X = V S V^*$ with unitary $V \in \mathbf{C}^{n^2 \times n^2}$ and $S = \operatorname{diag}(s_1 \cdots s_{n^2})$ with the singular values
ordered so that $s_1 \ge s_2 \ge \cdots \ge s_{n^2} \ge 0$. Then the coefficients in the basis representation of the Kraus
matrices (101) are,

$$
a_{k\mu} = \sqrt{s_k}\, V^*_{\mu k}, \quad k, \mu = 1, \ldots, n^2
\tag{107}
$$

Theoretically there can be fewer than $n^2$ Kraus operators. For example, if the Q system is unitary, then,

$$
Q(\rho) = U \rho U^*
\tag{108}
$$

In effect, there is one Kraus operator, $U$, which is unitary and of the same dimension as the input state
$\rho$. The corresponding $X$ matrix is a dyad, hence $\operatorname{rank} X = 1$. Adding a rank constraint would thus
force a simplification of the implementation. Unfortunately, a rank constraint is not convex. However,
the $X$ matrix is symmetric and positive semidefinite, hence the heuristic from [19] applies where the
rank constraint is replaced by the trace constraint,

$$
\operatorname{Tr} X \le \eta
\tag{109}
$$

From the singular value decomposition of $X$, $\operatorname{Tr} X = \sum_k s_k$. Adding the constraint (109) to (106)
will force some (or many) of the $s_k$ to be small, and these can be eliminated (post-optimization), thereby
reducing the rank. The auxiliary parameter $\eta$ can be used to find a tradeoff between simpler realizations
and performance.
A Optimality Conditions

Optimality conditions are derived from Lagrange Duality Theory for the following detection criteria: (i) 
average joint performance, (ii) worst-case a posteriori performance with noise-free measurements, and 
(iii) worst-case a posteriori performance with noisy measurements. 

Caveat emptor The material in this section is meant to be a "scaffold" to what can be found in some 
of the recent texts on convex optimization, e.g., see [5] and the references therein. More specifically, 
we refer principally to the sections in [5] where detailed information and proofs can be found for any 
axiomatic statements made here. The same caution applies to our references to computational methods: 
interested readers should refer directly to the available convex solvers which can be downloaded from 
the web, e.g., SDPSOL [43] or SEDUMI [42].

A.1 Optimality conditions for average joint performance

We will apply Lagrange Duality Theory [5, Ch.5] to the optimization problem (23) referred to in this
context as the primal problem. The Lagrange function associated with the primal problem (23) is,

$$
L(O, Z, Y) = \sum_{i=1}^m \operatorname{Tr} O_i A_i - \sum_{i=1}^m \operatorname{Tr} Z_i O_i + \operatorname{Tr} Y \Big( I_n - \sum_{i=1}^m O_i \Big)
\tag{110}
$$

with Lagrange multipliers $Z_i \in \mathbf{C}^{n \times n}$, $Z_i \ge 0$ for the inequality constraint $O_i \ge 0$, and $Y \in
\mathbf{R}^{n \times n}$, $Y = Y^T$ for the equality constraint $\sum_{i=1}^m O_i = I_n$. The first term in $L(O, Z, Y)$ is the objective
function in (23) expressed in terms of the data matrices $A_i$ from (25). The Lagrange dual function is
defined as,

$$
g(Z, Y) = \inf_{O} L(O, Z, Y) =
\begin{cases}
\operatorname{Tr} Y & A_i - Z_i - Y = 0, \quad i = 1, \ldots, m \\
-\infty & \text{otherwise}
\end{cases}
\tag{111}
$$

One of the important properties of the dual function is that for any $Z_i \ge 0$ and any $Y$, we get the lower
bound,

$$
g(Z, Y) \le \delta^{\mathrm{opt}}
\tag{112}
$$

where $\delta^{\mathrm{opt}}$ is the optimal objective value from solving (23). The Lagrange dual problem establishes the
largest lower bound from,

$$
\begin{array}{ll}
\text{maximize} & g(Z, Y) \\
\text{subject to} & Z_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{113}
$$

where the optimization variables are $(Z, Y)$. Using (111) we can eliminate the $Z_i$ variables and write the
dual problem explicitly in terms of the $Y$ variables as,

$$
\begin{array}{ll}
\text{maximize} & \operatorname{Tr} Y \\
\text{subject to} & A_i - Y \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{114}
$$

A solution, $Y^{\mathrm{opt}}$, the dual optimal multiplier, also returns the maximum objective value, $d^{\mathrm{opt}} = \operatorname{Tr} Y^{\mathrm{opt}}$,
the dual optimal value. From (112) we get $d^{\mathrm{opt}} \le \delta^{\mathrm{opt}}$. A numerical solution of the primal problem (23)
always returns $\delta \ge \delta^{\mathrm{opt}}$, and likewise numerically solving the dual problem (114) will always return
$d \le d^{\mathrm{opt}}$. Thus, the optimal solution is always contained in the known interval $d \le d^{\mathrm{opt}} \le \delta^{\mathrm{opt}} \le \delta$.
For this primal-dual pair we also have strong duality, that is, $d^{\mathrm{opt}} = \delta^{\mathrm{opt}}$. This follows because the
primal problem satisfies Slater's condition [5, §5.2.3], which in this case means that the primal problem
is convex and there exist strictly feasible $\{O_i\}$, i.e., $O_i > 0$, $i = 1, \ldots, m$, $\sum_{i=1}^m O_i = I_n$. (For
example, let $O_i = I_n/m$.) The optimal and computed objective values then satisfy,

$$
d \le \operatorname{Tr} Y^{\mathrm{opt}} = \delta^{\mathrm{opt}} \le \delta
\tag{115}
$$

Strong duality also implies the following complementary slackness conditions [5, §5.5.2],

$$
Z_i^{\mathrm{opt}} O_i^{\mathrm{opt}} = \big( A_i - Y^{\mathrm{opt}} \big) O_i^{\mathrm{opt}} = 0, \quad i = 1, \ldots, m
\tag{116}
$$

The last equality uses $Z_i = A_i - Y$ from (111). Combining $\sum_{i=1}^m O_i^{\mathrm{opt}} = I$ with $(A_i - Y^{\mathrm{opt}}) O_i^{\mathrm{opt}} = 0$, $i =
1, \ldots, m$ gives,

$$
Y^{\mathrm{opt}} = \sum_{i=1}^m A_i O_i^{\mathrm{opt}}
\tag{117}
$$

This can be used to eliminate $Y^{\mathrm{opt}}$ in (114) and (116), yielding the constraints,

$$
\begin{aligned}
A_i - \sum_{j=1}^m A_j O_j &\ge 0, \quad i = 1, \ldots, m \\
\Big( A_i - \sum_{j=1}^m A_j O_j \Big) O_i &= 0, \quad i = 1, \ldots, m
\end{aligned}
\tag{118}
$$

These are the conditions stated in (24) as being necessary and sufficient for optimality of any feasible
POVM set $\{O_i\}$. The proof of this statement relies on the fact that if strong duality holds and the
primal problem is convex - both true for this problem - then the above conditions (118) are equivalent
to the Karush-Kuhn-Tucker (KKT) conditions for optimality, which in this case are both necessary and
sufficient [5, §5.5.3]. Thus, any feasible POVM set which satisfies (118) is optimal.
A.2 Optimality conditions for worst-case a posteriori performance

As shown in [5, §4.2.5], a solution to the quasiconvex optimization problem (38) can be obtained by
solving a series of convex feasibility problems together with a bisection method. We start with the
equivalence,

$$
w_i \Big( 1 - \frac{p_i \operatorname{Tr} O_i \rho_i}{\operatorname{Tr} O_i \rho} \Big) \le \delta
\quad \Longleftrightarrow \quad
\operatorname{Tr} O_i A_i(\delta) \le 0
\tag{119}
$$

Problem (38) is then equivalent to,

$$
\begin{array}{ll}
\text{minimize} & \delta \\
\text{subject to} & \operatorname{Tr} O_i A_i(\delta) \le 0, \quad i = 1, \ldots, m \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{120}
$$

where the variables are now the real scalar $\delta$ as well as the POVM matrices $\{O_i \in \mathbf{C}^{n \times n}\}$. The algorithm
below requires knowing an upper and lower bound on the optimal $\delta^{\mathrm{opt}}$. Without loss of generality we
can normalize the weights so that $0 \le w_i \le 1$. Since the objective is a weighted error probability, the
feasible range is $0 \le \delta^{\mathrm{opt}} \le 1$. The bisection algorithm as presented in [5, §4.2.5] now becomes:

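Bisection-Feasibility Method

given $\delta_{\min} = 0$, $\delta_{\max} = 1$, tolerance $\epsilon > 0$.
repeat

1. $\delta = (\delta_{\min} + \delta_{\max})/2$

2. Solve the convex feasibility problem

$$
\begin{array}{ll}
\text{find} & O_i, \quad i = 1, \ldots, m \\
\text{subject to} & \operatorname{Tr} O_i A_i(\delta) \le 0, \quad i = 1, \ldots, m \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{121}
$$

3. if feasible, $\delta_{\max} = \delta$, else $\delta_{\min} = \delta$

until $\delta_{\max} - \delta_{\min} \le \epsilon$.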
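The control flow of the Bisection-Feasibility Method can be sketched independently of the SDP solver. In the sketch below the feasibility test is passed in as a callable `is_feasible`, a hypothetical stand-in for solving the convex feasibility problem at a given level; in a real design it would wrap a call to a semidefinite programming solver. A toy threshold oracle is used here only so that the loop can be run and checked.

```python
def bisect_min_feasible(is_feasible, lo=0.0, hi=1.0, tol=1e-6):
    """Bracket the smallest feasible level to within tol.

    Assumes feasibility is monotone (every level above a feasible one is also
    feasible) and that the initial upper bound hi is feasible.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid   # mid is achievable, so the optimum is at or below it
        else:
            lo = mid   # mid is not achievable, so the optimum lies above it
    return lo, hi

# Toy oracle: pretend the optimal worst-case error is 0.13 (hypothetical value)
toy_optimum = 0.13
lower, upper = bisect_min_feasible(lambda delta: delta >= toy_optimum)
```

Each pass halves the interval, so reaching a tolerance of $\epsilon$ from the initial interval $[0, 1]$ takes about $\log_2(1/\epsilon)$ feasibility solves, roughly 20 for $\epsilon = 10^{-6}$.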

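The feasibility step is equivalent to solving the following SDP in the variables $(s, O_i)$:

$$
\begin{array}{ll}
\text{minimize} & s \\
\text{subject to} & \operatorname{Tr} O_i A_i(\delta) \le s, \quad i = 1, \ldots, m \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{122}
$$

Let $s^{\mathrm{opt}}$, $O_i^{\mathrm{opt}}$ denote the optimal solution. Under the temporary assumption that $\operatorname{Tr} O_i^{\mathrm{opt}} \rho > 0$, the
inequality $\operatorname{Tr} O_i^{\mathrm{opt}} A_i(\delta) \le s^{\mathrm{opt}}$ is equivalent to,

$$
\| e_{\mathrm{post}} \|_{\mathrm{wc}} = \max_i w_i \Big( 1 - \frac{p_i \operatorname{Tr} O_i^{\mathrm{opt}} \rho_i}{\operatorname{Tr} O_i^{\mathrm{opt}} \rho} \Big)
\le \delta + \frac{s^{\mathrm{opt}}}{\min_j \operatorname{Tr} O_j^{\mathrm{opt}} \rho}
\tag{123}
$$

It follows that if $s^{\mathrm{opt}} \le 0$ then $\delta$ is feasible, and hence $\delta^{\mathrm{opt}} \le \delta$. If $s^{\mathrm{opt}} > 0$ then $\delta$ is infeasible, i.e.,
$\delta^{\mathrm{opt}} > \delta$. The optimal value $\delta^{\mathrm{opt}}$ is clearly the solution to $s^{\mathrm{opt}}(\delta^{\mathrm{opt}}) = 0$. The bisection algorithm
together with using an interior-point method to solve the SDP (122) will return a value of $\delta$ to within any
desired, but finite, accuracy of the optimal.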
The key computational step is solving the feasibility problem (122). High quality code which uses
an interior-point method is recommended, such as that found in SDPSOL [43] or SEDUMI [42]. In many
cases the optimal POVM matrices are rank deficient, which may result in a large condition number in the
linear equations to be solved in the Newton step. This should not be a problem for well conceived code.

To obtain the optimality conditions we will now apply Lagrange Duality Theory to the feasibility
problem (122) in the Bisection-Feasibility method. Problem (122) is the primal problem. As previously
noted, the primal optimal value, $s^{\mathrm{opt}}(\delta)$, determines if $\delta$ is feasible. Specifically,

$$
\begin{aligned}
s^{\mathrm{opt}} > 0 \quad &\Longleftrightarrow \quad \delta < \delta^{\mathrm{opt}} \\
s^{\mathrm{opt}} = 0 \quad &\Longleftrightarrow \quad \delta = \delta^{\mathrm{opt}}
\end{aligned}
\tag{124}
$$
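The Lagrange function associated with the primal problem (122) is,

$$
L(s, O, \lambda, Z, Y) = s + \sum_{i=1}^m \Big[ \lambda_i \big( \operatorname{Tr} O_i A_i(\delta) - s \big) - \operatorname{Tr} Z_i O_i \Big] + \operatorname{Tr} Y \Big( I_n - \sum_{i=1}^m O_i \Big)
\tag{125}
$$

with Lagrange multipliers $\lambda_i \in \mathbf{R}$, $\lambda_i \ge 0$ for the inequality constraint $\operatorname{Tr} O_i A_i(\delta) \le s$, $Z_i \in
\mathbf{C}^{n \times n}$, $Z_i \ge 0$ for the inequality constraint $O_i \ge 0$, and $Y \in \mathbf{R}^{n \times n}$, $Y = Y^T$ for the equality
constraint $\sum_{i=1}^m O_i = I_n$. The Lagrange dual function is then,

$$
g(\lambda, Z, Y) = \inf_{s, O} L(s, O, \lambda, Z, Y) =
\begin{cases}
\operatorname{Tr} Y & \sum_{i=1}^m \lambda_i = 1, \quad \lambda_i A_i(\delta) - Z_i - Y = 0, \quad i = 1, \ldots, m \\
-\infty & \text{otherwise}
\end{cases}
\tag{126}
$$

The Lagrange dual problem establishes the largest lower bound from,

$$
\begin{array}{ll}
\text{maximize} & g(\lambda, Z, Y) \\
\text{subject to} & \lambda_i \ge 0, \quad Z_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{127}
$$

where the optimization variables are $(\lambda, Z, Y)$. Using (126) we can eliminate the $Z_i$ variables and write
the dual problem explicitly in terms of the $\lambda_i$ and $Y$ variables as,

$$
\begin{array}{ll}
\text{maximize} & \operatorname{Tr} Y \\
\text{subject to} & \lambda_i \ge 0, \quad \lambda_i A_i(\delta) - Y \ge 0, \quad i = 1, \ldots, m \\
 & \sum_{i=1}^m \lambda_i = 1
\end{array}
\tag{128}
$$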

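The dual optimal solution is $(\lambda^{\mathrm{opt}}, Y^{\mathrm{opt}})$. Strong duality also holds for this problem because Slater's
condition holds [5, §5.2.3]: there exist strictly feasible $(s, O_i)$, such that $\operatorname{Tr} O_i A_i(\delta) < s$, $O_i > 0$, $i =
1, \ldots, m$, $\sum_{i=1}^m O_i = I_n$. Since the primal (feasibility) problem is convex, the optimal primal and dual
objective values are equal,

$$
\operatorname{Tr} Y^{\mathrm{opt}} = s^{\mathrm{opt}}
\tag{129}
$$

Strong duality also implies the following complementary slackness conditions [5, §5.5.2],

$$
\begin{aligned}
\lambda_i^{\mathrm{opt}} \big( \operatorname{Tr} O_i^{\mathrm{opt}} A_i(\delta) - s^{\mathrm{opt}} \big) &= 0, \quad i = 1, \ldots, m \\
Z_i^{\mathrm{opt}} O_i^{\mathrm{opt}} = \big( \lambda_i^{\mathrm{opt}} A_i(\delta) - Y^{\mathrm{opt}} \big) O_i^{\mathrm{opt}} &= 0, \quad i = 1, \ldots, m
\end{aligned}
\tag{130}
$$

The last line uses $Z_i = \lambda_i A_i(\delta) - Y$ from (126). Combining $\sum_{i=1}^m O_i^{\mathrm{opt}} = I$ with $(\lambda_i^{\mathrm{opt}} A_i(\delta^{\mathrm{opt}}) -
Y^{\mathrm{opt}}) O_i^{\mathrm{opt}} = 0$, $i = 1, \ldots, m$ gives,

$$
Y^{\mathrm{opt}} = \sum_{i=1}^m \lambda_i^{\mathrm{opt}} A_i(\delta) O_i^{\mathrm{opt}}
\tag{131}
$$

We now put all the primal and dual equality and inequality constraints together at the optimal $\delta = \delta^{\mathrm{opt}}$,
$s^{\mathrm{opt}} = \operatorname{Tr} Y^{\mathrm{opt}} = 0$, and use (131) to eliminate $Y^{\mathrm{opt}}$. To simplify notation we drop the superscript
$(\cdot)^{\mathrm{opt}}$ from all the variables $(O, \lambda, Y, Z, \delta)$. This gives: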

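$$
\begin{aligned}
\sum_{i=1}^m O_i &= I \\
O_i &\ge 0, \quad i = 1, \ldots, m \\
\lambda_i \operatorname{Tr} O_i A_i(\delta) &= 0, \quad i = 1, \ldots, m \\
\lambda_i A_i(\delta) - \sum_{j=1}^m \lambda_j A_j(\delta) O_j &\ge 0, \quad i = 1, \ldots, m \\
\Big( \lambda_i A_i(\delta) - \sum_{j=1}^m \lambda_j A_j(\delta) O_j \Big) O_i &= 0, \quad i = 1, \ldots, m \\
\lambda_i &\ge 0, \quad i = 1, \ldots, m \\
\sum_{i=1}^m \lambda_i &= 1
\end{aligned}
\tag{132}
$$

These can also be established directly from the KKT conditions for optimality, which in this case are
both necessary and sufficient [5, §5.5.3]. For the linear constraints, either the constraint is active,
$\operatorname{Tr} A_i(\delta) O_i = 0$, $\lambda_i \ge 0$, or inactive, $\operatorname{Tr} A_i(\delta) O_i < 0$, $\lambda_i = 0$. Combining this with (132) gives
the optimality conditions in (43).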

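Suppose the weights are all equal with $w_i = 1$, $\forall i$. Then,

$$
A_i(\delta) = \gamma\rho - p_i \rho_i = A_i(\gamma)
\tag{133}
$$

with $\gamma = 1 - \delta$. From now on we will use $A_i(\gamma)$ or $A_i(\delta)$ as appropriate to the context.

Suppose the optimal is achieved by only one constraint, that is, for $i = k$, $\operatorname{Tr} O_k A_k(\gamma) = 0$ and for
$i \ne k$, $\operatorname{Tr} O_i A_i(\gamma) < 0$. Then, $\lambda_k = 1$, $\lambda_{i \ne k} = 0$ and the optimality conditions (132) reduce to,

$$
\begin{aligned}
A_k(\gamma)(I - O_k) &\ge 0 \\
A_k(\gamma) O_k &\le 0 \\
A_k(\gamma)(I - O_k) O_k &= 0 \\
A_k(\gamma) O_k O_i &= 0, \quad i \ne k \\
\operatorname{Tr} O_k A_k(\gamma) &= 0
\end{aligned}
\tag{134}
$$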

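Since $\rho > 0$ by assumption (21),

$$
\begin{aligned}
\det A_k(\gamma) &= \det \Big( \rho^{1/2} \big( \gamma I - p_k \rho^{-1/2} \rho_k \rho^{-1/2} \big) \rho^{1/2} \Big) \\
 &= (\det \rho) \prod_{j=1}^n (\gamma - p_k \omega_{kj})
\end{aligned}
$$

with $\omega_{kj}$, $j = 1, \ldots, n$ the eigenvalues of $\rho^{-1/2} \rho_k \rho^{-1/2}$. Because $\rho > 0$, $\rho_k \ge 0$, they are all
non-negative and $\max_j \omega_{kj} > 0$. Let $\gamma = p_k \max_j \omega_{kj}$, or equivalently,

$$
\gamma = p_k\, \sigma_{\max}\big( \rho^{-1/2} \rho_k \rho^{-1/2} \big)
\tag{135}
$$

where $\sigma_{\max}(\cdot)$ is the maximum singular value of the matrix argument. With this choice $\det A_k(\gamma) = 0$
and hence $A_k(\gamma) \ge 0$ has the decomposition:

$$
\begin{aligned}
A_k(\gamma) &= \begin{bmatrix} U_{k+} & U_{k0} \end{bmatrix}
\begin{bmatrix} \Omega_{k+} & 0_{n-1} \\ 0_{n-1}^T & 0 \end{bmatrix}
\begin{bmatrix} U_{k+}^* \\ U_{k0}^* \end{bmatrix} \\
\Omega_{k+} &= \operatorname{diag}(\omega_1, \ldots, \omega_{n-1}), \quad \omega_1 \ge \omega_2 \ge \cdots \ge \omega_{n-1} \ge 0, \quad \omega_1 > 0
\end{aligned}
\tag{136}
$$

for unitary $[U_{k+} \; U_{k0}]$ with $U_{k+} \in \mathbf{C}^{n \times (n-1)}$ and $U_{k0} \in \mathbf{C}^{n \times 1}$. Setting,

$$
O_k = U_{k0} U_{k0}^*, \quad I - O_k = U_{k+} U_{k+}^*
\tag{137}
$$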
gives $A_k(\gamma)(I - O_k) = U_{k+} \Omega_{k+} U_{k+}^* \ge 0$, $A_k(\gamma) O_k = 0$, $O_k (I - O_k) = 0$, thus satisfying the
optimality conditions. Observe also that $O_k$ is a rank 1 projector, and $I - O_k$ is a rank $n - 1$ projector.
Also, $I - O_k = \sum_{i \ne k} O_i$, and hence is the sum of the remaining $m - 1$ POVM elements. These are thus
arbitrary except for satisfying (137) with each $O_i \ge 0$, $i \ne k$.

Since the single active constraint $k$ can occur for any $i = 1, \ldots, m$, then,

$$
\gamma = \min_{i=1,\ldots,m} p_i\, \sigma_{\max}\big( \rho^{-1/2} \rho_i \rho^{-1/2} \big)
\tag{138}
$$

which establishes (47) as the optimal objective value for equal weights with one active linear constraint.
More specifically, this means that there is a single index $k \in \{1, \ldots, m\}$ such that $\gamma = p_k \sigma_{\max}(\rho^{-1/2} \rho_k \rho^{-1/2}) <
p_i \sigma_{\max}(\rho^{-1/2} \rho_i \rho^{-1/2})$, $i \ne k$.

The same procedure involving the decomposition (136) is used to arrive at the results for single pure
state detection with weights $w_1 = 1$, $w_2 = 0$ given by (54)-(56).



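A.3 Optimality conditions for worst-case a posteriori performance with noisy measurements

To apply the Bisection-Feasibility Method as described in the previous section, replace $O_i$ with $O_i^{\mathrm{noisy}}$
everywhere in (122). Thus the primal (feasibility) problem becomes,

$$
\begin{array}{ll}
\text{minimize} & s \\
\text{subject to} & \operatorname{Tr} O_i^{\mathrm{noisy}} A_i(\delta) \le s, \quad i = 1, \ldots, m \\
 & O_i^{\mathrm{noisy}} = \sum_{j=1}^m \nu_{ij} O_j \\
 & \sum_{i=1}^m O_i = I_n, \quad O_i \ge 0, \quad i = 1, \ldots, m
\end{array}
\tag{139}
$$

The Lagrange function is then,

$$
L(s, O, \lambda, Z, Y) = s + \sum_{j=1}^m \lambda_j \big( \operatorname{Tr} O_j^{\mathrm{noisy}} A_j(\delta) - s \big) - \sum_{i=1}^m \operatorname{Tr} Z_i O_i + \operatorname{Tr} Y \Big( I_n - \sum_{i=1}^m O_i \Big)
\tag{140}
$$

with Lagrange multipliers $\lambda_i \in \mathbf{R}$, $\lambda_i \ge 0$ for the inequality constraint $\operatorname{Tr} O_i^{\mathrm{noisy}} A_i(\delta) \le s$, $Z_i \in
\mathbf{C}^{n \times n}$, $Z_i \ge 0$ for the inequality constraint $O_i \ge 0$, and $Y \in \mathbf{R}^{n \times n}$, $Y = Y^T$ for the equality constraint
$\sum_{i=1}^m O_i = I_n$. Eliminating the noisy POVM terms gives,

$$
L(s, O, \lambda, Z, Y) = s \Big( 1 - \sum_{i=1}^m \lambda_i \Big) + \operatorname{Tr} Y + \sum_{i=1}^m \operatorname{Tr} O_i \big( A_i(\delta, \nu) - Z_i - Y \big)
\tag{141}
$$

with the $A_i(\delta, \nu)$ given by (65). Although not shown, the optimality conditions (64) can be established
by repeating, mutatis mutandis, all the steps in the previous section, i.e., formulate the dual problem,
show that strong duality holds, and so on.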
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